Invitation: Pair Production in $e^+e^-$ Annihilation


The main purpose of Part I of this book is to develop the basic calculational 
method of quantum field theory, the formalism of Feynman diagrams. We will 
then apply this formalism to computations in Quantum Electrodynamics, the 
quantum theory of electrons and photons. 

Quantum Electrodynamics (QED) is perhaps the best fundamental physical
theory we have. The theory is formulated as a set of simple equations
(Maxwell’s equations and the Dirac equation) whose form is essentially
determined by relativistic invariance. The quantum-mechanical solutions of
these equations give detailed predictions of electromagnetic phenomena from
macroscopic distances down to regions several hundred times smaller than the
proton.

Feynman diagrams provide for this elegant theory an equally elegant
procedure for calculation: Imagine a process that can be carried out by
electrons and photons, draw a diagram, and then use the diagram to write the
mathematical form of the quantum-mechanical amplitude for that process to
occur.

In this first part of the book we will develop both the theory of QED
and the method of Feynman diagrams from the basic principles of quantum
mechanics and relativity. Eventually, we will arrive at a point where we can
calculate observable quantities that are of great interest in the study of
elementary particles. But to reach our goal of deriving this simple
calculational method, we must first, unfortunately, make a serious detour into
formalism. The three chapters that follow this one are almost completely
formal, and the reader might wonder, in the course of this development, where
we are going. We would like to partially answer that question in advance by
discussing the physics of an especially simple QED process, one sufficiently
simple that many of its features follow directly from physical intuition. Of
course, this intuitive, bottom-up approach will contain many gaps. In
Chapter 5 we will return to this process with the full power of the Feynman
diagram formalism. Working from the top down, we will then see all of these
difficulties swept away.

Figure 1.1. The annihilation reaction $e^+e^- \to \mu^+\mu^-$, shown in the
center-of-mass frame.


The Simplest Situation 

Since most particle physics experiments involve scattering, the most
commonly calculated quantities in quantum field theory are scattering cross
sections. We will now calculate the cross section for the simplest of all QED
processes: the annihilation of an electron with its antiparticle, a positron, to
form a pair of heavier leptons (such as muons). The existence of antiparticles
is actually a prediction of quantum field theory, as we will discuss in Chapters
2 and 3. For the moment, though, we take their existence as given.

An experiment to measure this annihilation probability would proceed by
firing a beam of electrons at a beam of positrons. The measurable quantity is
the cross section for the reaction $e^+e^- \to \mu^+\mu^-$ as a function of the
center-of-mass energy and the relative angle $\theta$ between the incoming
electrons and the outgoing muons. The process is illustrated in Fig. 1.1. For
simplicity, we work in the center-of-mass (CM) frame where the momenta satisfy
$\mathbf{p}' = -\mathbf{p}$ and $\mathbf{k}' = -\mathbf{k}$. We also assume that
the beam energy $E$ is much greater than either the electron or the muon mass,
so that $|\mathbf{p}| = |\mathbf{p}'| = |\mathbf{k}| = |\mathbf{k}'| = E \equiv
E_{\rm cm}/2$. (We use boldface type to denote 3-vectors and ordinary italic
type to denote 4-vectors.)

Since both the electron and the muon have spin 1/2, we must specify their 
spin orientations. It is useful to take the axis that defines the spin quantization 
of each particle to be in the direction of its motion; each particle can then 
have its spin polarized parallel or antiparallel to this axis. In practice, electron 
and positron beams are often unpolarized, and muon detectors are normally 
blind to the muon polarization. Hence we should average the cross section 
over electron and positron spin orientations, and sum the cross section over 
muon spin orientations. 

For any given set of spin orientations, it is conventional to write the
differential cross section for our process, with the $\mu^-$ produced into a
solid angle $d\Omega$, as

$$\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 E_{\rm cm}^2}\,|\mathcal{M}|^2. \tag{1.1}$$
The factor $E_{\rm cm}^{-2}$ provides the correct dimensions for a cross
section, since in our units (energy)$^{-2}$ ~ (length)$^2$. The quantity
$\mathcal{M}$ is therefore dimensionless; it is the quantum-mechanical
amplitude for the process to occur (analogous to the scattering amplitude $f$
in nonrelativistic quantum mechanics), and we must now address the question of
how to compute it from fundamental theory. The other factors in the expression
are purely a matter of convention. Equation (1.1) is actually a special case,
valid for CM scattering when the final state contains two massless particles,
of a more general formula (whose form cannot be deduced from dimensional
analysis) which we will derive in Section 4.5.
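
As a rough numerical illustration of the statement that (energy)$^{-2}$
corresponds to (length)$^2$, the sketch below converts a quantity given in
GeV$^{-2}$ into an area. The value of $\hbar c$ and all function names are
assumptions introduced here for illustration; they do not appear in the text.

```python
# Assumed reference value of hbar*c in GeV*fm (not taken from the text).
HBAR_C_GEV_FM = 0.1973269804
# By definition, 1 barn = 1e-28 m^2 = 100 fm^2.
FM2_PER_BARN = 100.0


def inverse_gev2_to_fm2(value_in_inverse_gev2):
    """Convert a quantity in GeV^-2 (natural units) to fm^2."""
    return value_in_inverse_gev2 * HBAR_C_GEV_FM ** 2


def inverse_gev2_to_nanobarn(value_in_inverse_gev2):
    """Convert a quantity in GeV^-2 (natural units) to nanobarns."""
    barns = inverse_gev2_to_fm2(value_in_inverse_gev2) / FM2_PER_BARN
    return barns * 1.0e9


if __name__ == "__main__":
    # An area of 1 GeV^-2 expressed in fm^2 and in nb.
    print(inverse_gev2_to_fm2(1.0), "fm^2")
    print(inverse_gev2_to_nanobarn(1.0), "nb")
```

Because the conversion is a pure rescaling, any cross section that falls as
$E_{\rm cm}^{-2}$ in natural units falls the same way in nanobarns.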

Now comes some bad news and some good news. 

The bad news is that even for this simplest of QED processes, the exact
expression for $\mathcal{M}$ is not known. Actually this fact should come as no
surprise, since even in nonrelativistic quantum mechanics, scattering problems
can rarely be solved exactly. The best we can do is obtain a formal expression
for $\mathcal{M}$ as a perturbation series in the strength of the
electromagnetic interaction, and evaluate the first few terms in this series.

The good news is that Feynman has invented a beautiful way to organize
and visualize the perturbation series: the method of Feynman diagrams.
Roughly speaking, the diagrams display the flow of electrons and photons
during the scattering process. For our particular calculation, the
lowest-order term in the perturbation series can be represented by a single
diagram, shown in Fig. 1.2. The diagram is made up of three types of
components: external lines (representing the four incoming and outgoing
particles), internal lines (representing “virtual” particles, in this case one
virtual photon), and vertices. It is conventional to use straight lines for
fermions and wavy lines for photons. The arrows on the straight lines denote
the direction of negative charge flow, not momentum. We assign a 4-momentum
vector to each external line, as shown. In this diagram, the momentum $q$ of
the one internal line is determined by momentum conservation at either of the
vertices: $q = p + p' = k + k'$. We must also associate a spin state (either
“up” or “down”) with each external fermion.
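
The momentum assignments can be checked with a small numerical sketch. It
builds massless CM-frame 4-momenta and confirms that the two vertices give the
same photon momentum, with $q^2 = E_{\rm cm}^2$. The choice of axes (electron
along $+z$, $\mu^-$ in the $xz$-plane), the metric signature $(+,-,-,-)$, and
the helper names are assumptions made for this illustration.

```python
import math


def minkowski_dot(a, b):
    """Dot product of two 4-vectors with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def add(a, b):
    """Component-wise sum of two 4-vectors."""
    return tuple(x + y for x, y in zip(a, b))


def cm_momenta(E, theta):
    """Massless CM-frame 4-momenta (E, px, py, pz) for e- e+ -> mu- mu+.

    Assumption: the electron moves along +z and the mu- emerges at angle
    theta from the z-axis, in the xz-plane.
    """
    p = (E, 0.0, 0.0, E)                                     # incoming e-
    p_prime = (E, 0.0, 0.0, -E)                              # incoming e+
    k = (E, E * math.sin(theta), 0.0, E * math.cos(theta))   # outgoing mu-
    k_prime = (E, -k[1], -k[2], -k[3])                       # outgoing mu+
    return p, p_prime, k, k_prime


E = 5.0  # beam energy, arbitrary units
p, p_prime, k, k_prime = cm_momenta(E, theta=0.7)
q_in = add(p, p_prime)     # photon momentum fixed at the electron vertex
q_out = add(k, k_prime)    # photon momentum fixed at the muon vertex
q_squared = minkowski_dot(q_in, q_in)

if __name__ == "__main__":
    print("q from e+e- vertex:", q_in)
    print("q from mu+mu- vertex:", q_out)
    print("q^2 =", q_squared, " (2E)^2 =", (2 * E) ** 2)
```

In this frame the virtual photon carries energy $E_{\rm cm}$ and no
3-momentum, so it is far from satisfying the massless condition $q^2 = 0$ of a
real photon.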

According to the Feynman rules, each diagram can be translated directly
into a contribution to $\mathcal{M}$. The rules assign a short algebraic factor
to each element of a diagram, and the product of these factors gives the value
of the corresponding term in the perturbation series. Getting the resulting
expression for $\mathcal{M}$ into a form that is usable, however, can still be
nontrivial. We will develop much useful technology for doing such calculations
in subsequent chapters. But we do not have that technology yet, so to get an
answer to our particular problem we will use some heuristic arguments instead
of the actual Feynman rules.

Recall that in quantum-mechanical perturbation theory, a transition
amplitude can be computed, to first order, as an expression of the form

$$\langle \text{final state} |\, H_I \,| \text{initial state} \rangle, \tag{1.2}$$

Figure 1.2. Feynman diagram for the lowest-order term in the
$e^+e^- \to \mu^+\mu^-$ cross section. At this order the only possible
intermediate state is a photon ($\gamma$).

where $H_I$ is the “interaction” part of the Hamiltonian. In our case the
initial state is $|e^+e^-\rangle$ and the final state is
$\langle\mu^+\mu^-|$. But our interaction Hamiltonian couples electrons to
muons only through the electromagnetic field (that is, photons), not directly.
So the first-order result (1.2) vanishes, and we must go to the second-order
expression

$$\mathcal{M} \sim \langle\mu^+\mu^-|\, H_I \,|\gamma\rangle^\mu \, \langle\gamma|\, H_I \,|e^+e^-\rangle_\mu. \tag{1.3}$$

This is a heuristic way of writing the contribution to $\mathcal{M}$ from the
diagram in Fig. 1.2. The external electron lines correspond to the factor
$|e^+e^-\rangle$; the external muon lines correspond to
$\langle\mu^+\mu^-|$. The vertices correspond to $H_I$, and the internal
photon line corresponds to the operator $|\gamma\rangle\langle\gamma|$. We have
added vector indices ($\mu$) because the photon is a vector particle with four
components. There are four possible intermediate states, one for each
component, and according to the rules of perturbation theory we must sum over
intermediate states. Note that since the sum in (1.3) takes the form of a
4-vector dot product, the amplitude $\mathcal{M}$ will be a Lorentz-invariant
scalar as long as each half of (1.3) is a 4-vector.

Let us try to guess the form of the vector
$\langle\gamma|\, H_I \,|e^+e^-\rangle^\mu$. Since $H_I$ couples electrons to
photons with a strength $e$ (the electron charge), the matrix element should be
proportional to $e$. Now consider one particular set of initial and final spin
orientations, shown in Fig. 1.3. The electron and muon have spins parallel to
their directions of motion; they are “right-handed”. The antiparticles,
similarly, are “left-handed”. The electron and positron spins add up to one
unit of angular momentum in the $+z$ direction. Since $H_I$ should conserve
angular momentum, the photon to which these particles couple must have the
correct polarization vector to give it this same angular momentum:
$\epsilon^\mu = (0, 1, i, 0)$. Thus we have

$$\langle\gamma|\, H_I \,|e^+e^-\rangle^\mu \propto e\,(0, 1, i, 0). \tag{1.4}$$

Figure 1.3. One possible set of spin orientations. The electron and the
negative muon are right-handed, while the positron and the positive muon are
left-handed.

The muon matrix element should, similarly, have a polarization
corresponding to one unit of angular momentum along the direction of the
$\mu^-$ momentum $\mathbf{k}$. To obtain the correct vector, rotate (1.4)
through an angle $\theta$ in the $xz$-plane:

$$\langle\gamma|\, H_I \,|\mu^+\mu^-\rangle^\mu \propto e\,(0, \cos\theta, i, -\sin\theta). \tag{1.5}$$

To compute the amplitude $\mathcal{M}$, we complex-conjugate this vector and
dot it into (1.4). Thus we find, for this set of spin orientations,

$$\mathcal{M}(RL \to RL) = -e^2\,(1 + \cos\theta). \tag{1.6}$$

Of course we cannot determine the overall factor by this method, but in (1.6)
it happens to be correct, thanks to the conventions adopted in (1.1). Note
that the amplitude vanishes for $\theta = 180^\circ$, just as one would expect:
A state whose angular momentum is in the $+z$ direction has no overlap with a
state whose angular momentum is in the $-z$ direction.
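
The dot product leading to (1.6) is short enough to check by direct
arithmetic. The sketch below contracts the complex conjugate of the muon
vector (1.5) with the electron vector (1.4), using the metric signature
$(+,-,-,-)$. The function names and the numerical value chosen for $e$ are
illustrative assumptions.

```python
import math


def minkowski_dot(a, b):
    """Dot product of two 4-vectors with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def electron_vector(e):
    """The vector of Eq. (1.4): e * (0, 1, i, 0)."""
    return (0j, e + 0j, 1j * e, 0j)


def muon_vector(e, theta):
    """The vector of Eq. (1.5): e * (0, cos(theta), i, -sin(theta))."""
    return (0j, e * math.cos(theta) + 0j, 1j * e, -e * math.sin(theta) + 0j)


def amplitude_rl_to_rl(e, theta):
    """Complex-conjugate the muon vector and dot it into the electron vector."""
    conjugated = tuple(z.conjugate() for z in muon_vector(e, theta))
    return minkowski_dot(conjugated, electron_vector(e))


if __name__ == "__main__":
    e = math.sqrt(4 * math.pi / 137.0)  # assumed value, from alpha ~ 1/137
    for degrees in (0, 90, 180):
        theta = math.radians(degrees)
        print(degrees, amplitude_rl_to_rl(e, theta), -e**2 * (1 + math.cos(theta)))
```

The spatial components contribute $\cos\theta + (-i)(i) = 1 + \cos\theta$, and
the overall minus sign comes from the metric.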

Next consider the case in which the electron and positron are both
right-handed. Now their total spin angular momentum is zero, and the argument
is more subtle. We might expect to obtain a longitudinally polarized photon
with a Clebsch-Gordan coefficient of $1/\sqrt{2}$, just as when we add angular
momenta in three dimensions,
$|{\uparrow\downarrow}\rangle = (1/\sqrt{2})\,(|j = 1, m = 0\rangle + |j = 0, m = 0\rangle)$.
But we are really adding angular momenta in the four-dimensional Lorentz group,
so we must take into account not only spin (the transformation properties of
states under rotations), but also the transformation properties of states under
boosts. It turns out, as we shall discuss in Chapter 3, that the Clebsch-Gordan
coefficient that couples a 4-vector to the state $|e^-_R e^+_R\rangle$ of
massless fermions is zero. (For the record, the state is a superposition of
scalar and antisymmetric tensor pieces.) Thus the amplitude
$\mathcal{M}(RR \to RL)$ is zero, as are the eleven other amplitudes in which
either the initial or final state has zero total angular momentum.

The remaining nonzero amplitudes can be found in the same way that we
found the first one. They are

$$\begin{aligned}
\mathcal{M}(RL \to LR) &= -e^2\,(1 - \cos\theta), \\
\mathcal{M}(LR \to RL) &= -e^2\,(1 - \cos\theta), \\
\mathcal{M}(LR \to LR) &= -e^2\,(1 + \cos\theta).
\end{aligned} \tag{1.7}$$


Inserting these expressions into (1.1), averaging over the four initial-state
spin orientations, and summing over the four final-state spin orientations, we
find

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4E_{\rm cm}^2}\,\bigl(1 + \cos^2\theta\bigr), \tag{1.8}$$


where $\alpha = e^2/4\pi \simeq 1/137$. Integrating over the angular variables
$\theta$ and $\phi$ gives the total cross section,

$$\sigma_{\rm total} = \frac{4\pi\alpha^2}{3E_{\rm cm}^2}. \tag{1.9}$$


Results (1.8) and (1.9) agree with experiments to about 10%; almost all of 
the discrepancy is accounted for by the next term in the perturbation series, 
corresponding to the diagrams shown in Fig. 1.4. The qualitative features 
of these expressions—the angular dependence and the sharp decrease with 
energy—are obvious in the actual data. (The properties of these results are 
discussed in detail in Section 5.1.) 
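
The spin average and the angular integration can be reproduced numerically.
The sketch below squares the four nonzero helicity amplitudes, divides by the
four initial spin states, applies the prefactor of (1.1), and then integrates
over the solid angle with a midpoint rule. The function names, the value
$\alpha = 1/137$, and the number of integration steps are assumptions made for
this illustration.

```python
import math

ALPHA = 1.0 / 137.0  # approximate fine-structure constant, as in the text


def helicity_amplitudes(theta, e_squared):
    """The four nonzero amplitudes of Eqs. (1.6) and (1.7)."""
    c = math.cos(theta)
    return {
        ("RL", "RL"): -e_squared * (1 + c),
        ("RL", "LR"): -e_squared * (1 - c),
        ("LR", "RL"): -e_squared * (1 - c),
        ("LR", "LR"): -e_squared * (1 + c),
    }


def dsigma_domega(theta, e_cm, alpha=ALPHA):
    """Unpolarized differential cross section built from Eq. (1.1)."""
    e_squared = 4 * math.pi * alpha
    amplitudes = helicity_amplitudes(theta, e_squared)
    summed = sum(abs(m) ** 2 for m in amplitudes.values())
    averaged = summed / 4.0  # average over the four initial spin states
    return averaged / (64 * math.pi**2 * e_cm**2)


def total_cross_section(e_cm, steps=2000, alpha=ALPHA):
    """Integrate over theta (midpoint rule); the phi integral gives 2*pi."""
    h = math.pi / steps
    integral = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        integral += dsigma_domega(theta, e_cm, alpha) * math.sin(theta) * h
    return 2 * math.pi * integral


if __name__ == "__main__":
    e_cm = 10.0  # natural units
    print(total_cross_section(e_cm), 4 * math.pi * ALPHA**2 / (3 * e_cm**2))
```

Doubling `e_cm` reduces the result by a factor of four, which is the decrease
with energy mentioned above.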


Embellishments and Questions 

We obtained the angular distribution predicted by Quantum Electrodynamics
for the reaction $e^+e^- \to \mu^+\mu^-$ by applying angular momentum
arguments, with little appeal to the underlying formalism. However, we used the
simplifying features of the high-energy limit and the center-of-mass frame in a
very strong way. The analysis we have presented will break down when we relax
any of our simplifying assumptions. So how does one perform general QED
calculations? To answer that question we must return to the Feynman rules.

As mentioned above, the Feynman rules tell us to draw the diagram(s) for 
the process we are considering, and to associate a short algebraic factor with 
each piece of each diagram. Figure 1.5 shows the diagram for our reaction, 
with the various assignments indicated. 

For the internal photon line we write $-ig_{\mu\nu}/q^2$, where $g_{\mu\nu}$ is
the usual Minkowski metric tensor and $q$ is the 4-momentum of the virtual
photon. This factor corresponds to the operator $|\gamma\rangle\langle\gamma|$
in our heuristic expression (1.3).

For each vertex we write $-ie\gamma^\mu$, corresponding to $H_I$ in (1.3). The
objects $\gamma^\mu$ are a set of four $4 \times 4$ constant matrices. They do
the “addition of angular momentum” for us, coupling a state of two spin-1/2
particles to a vector particle.

Figure 1.4. Feynman diagrams that contribute to the $\alpha^3$ term in the
$e^+e^- \to \mu^+\mu^-$ cross section.

Figure 1.5. Diagram of Fig. 1.2, with expressions corresponding to each
vertex, internal line, and external line.

The external lines carry expressions for four-component column-spinors
$u$, $v$, or row-spinors $\bar{u}$, $\bar{v}$. These are essentially the
momentum-space wavefunctions of the initial and final particles, and correspond
to $|e^+e^-\rangle$ and $\langle\mu^+\mu^-|$ in (1.3). The indices $s$, $s'$,
$r$, and $r'$ denote the spin state, either up or down.
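
We can now write down an expression for $\mathcal{M}$, reading everything
straight off the diagram:

$$\begin{aligned}
\mathcal{M} &= \bar{v}^{s'}(p')\,(-ie\gamma^\mu)\,u^{s}(p)\left(\frac{-ig_{\mu\nu}}{q^2}\right)\bar{u}^{r}(k)\,(-ie\gamma^\nu)\,v^{r'}(k') \\
&= \frac{ie^2}{q^2}\,\bigl(\bar{v}^{s'}(p')\,\gamma^\mu\,u^{s}(p)\bigr)\bigl(\bar{u}^{r}(k)\,\gamma_\mu\,v^{r'}(k')\bigr).
\end{aligned} \tag{1.10}$$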


It is instructive to compare this in detail with Eq. (1.3). 

To derive the cross section (1.8) from (1.10), we could return to the
angular momentum arguments used above, supplemented with some concrete
knowledge about $\gamma$ matrices and Dirac spinors. We will do the calculation
in this manner in Section 5.2. There are, however, a number of useful tricks
that can be employed to manipulate expressions like (1.10), especially when
one wants to compute only the unpolarized cross section. Using this “Feynman
trace technology” (so-called because one must evaluate traces of products of
$\gamma$-matrices), it isn’t even necessary to have explicit expressions for
the $\gamma$-matrices and Dirac spinors. The calculation becomes almost
completely mindless, and the answer (1.8) is obtained after less than a page of
algebra. But since the Feynman rules and trace technology are so powerful, we
can also relax some of our simplifying assumptions. To conclude this section,
let us discuss several ways in which our calculation could have been more
difficult.

The easiest restriction to relax is that the muons be massless. If the beam
energy is not much greater than the mass of the muon, all of our predictions
should depend on the ratio $m_\mu/E_{\rm cm}$. (Since the electron is 200 times
lighter than the muon, it can be considered massless whenever the beam energy
is large enough to create muons.) Using Feynman trace technology, it is
extremely easy to restore the muon mass to our calculation. The amount of
algebra is increased by about fifty percent, and the relation (1.1) between the
amplitude and the cross section must be modified slightly, but the answer is
worth the effort. We do this calculation in detail in Section 5.1.

Working in a different reference frame is also easy; the only modification 
is in the relation (1.1) between the amplitude and the cross section. Or one 
can simply perform a Lorentz transformation on the CM result, boosting it 
to a different frame. 

When the spin states of the initial and/or final particles are known and
we still wish to retain the muon mass, the calculation becomes somewhat
cumbersome but no more difficult in principle. The trace technology can be
generalized to this case, but it is often easier to evaluate expression (1.10)
directly, using the explicit values of the spinors $u$ and $v$.

Next one could compute cross sections for different processes. The process
$e^+e^- \to e^+e^-$, known as Bhabha scattering, is more difficult because
there is a second allowed diagram (see Fig. 1.6). The amplitudes for the two
diagrams must first be added, then squared.

Other processes contain photons in the initial and/or final states. The
paradigm example is Compton scattering, for which the two lowest-order
diagrams are shown in Fig. 1.7. The Feynman rules for external photon lines
and for internal electron lines are no more complicated than those we have
already seen. We discuss Compton scattering in detail in Section 5.5.

Figure 1.6. The two lowest-order diagrams for Bhabha scattering,
$e^+e^- \to e^+e^-$.

Figure 1.7. The two lowest-order diagrams for Compton scattering.

Finally we could compute higher-order terms in the perturbation series.
Thanks to Feynman, the diagrams are at least easy to draw; we have seen
those that contribute to the next term in the $e^+e^- \to \mu^+\mu^-$ cross
section in Fig. 1.4. Remarkably, the algorithm that assigns algebraic factors
to pieces of the diagrams holds for all higher-order contributions, and allows
one to evaluate such diagrams in a straightforward, if tedious, way. The
computation of the full set of nine diagrams is a serious chore, at the level
of a research paper.

In this book, starting in Chapter 6, we will analyze much of the physics 
that arises from higher-order Feynman diagrams such as those in Fig. 1.4. 
We will see that the last four of these diagrams, which involve an additional 
photon in the final state, are necessary because no detector is sensitive enough 
to notice the presence of extremely low-energy photons. Thus a final state 
containing such a photon cannot be distinguished from our desired final state 
of just a muon pair. 

The other five diagrams in Fig. 1.4 involve intermediate states of several
virtual particles rather than just a single virtual photon. In each of these
diagrams there will be one virtual particle whose momentum is not determined
by conservation of momentum at the vertices. Since perturbation theory
requires us to sum over all possible intermediate states, we must integrate
over all possible values of this momentum. At this step, however, a new
difficulty appears: The loop-momentum integrals in the first three diagrams,
when performed naively, turn out to be infinite. We will provide a fix for this
problem, so that we get finite results, by the end of Part I. But the question
of the physical origin of these divergences cannot be dismissed so lightly;
that will be the main subject of Part II of this book.

We have discussed Feynman diagrams as an algorithm for performing
computations. The chapters that follow should amply illustrate the power of
this tool. As we expose more applications of the diagrams, though, they begin
to take on a life and significance of their own. They indicate unsuspected
relations between different physical processes, and they suggest intuitive
arguments that might later be verified by calculation. We hope that this book
will enable you, the reader, to take up this tool and apply it in novel and
enlightening ways.


The Klein-Gordon Field


2.1 The Necessity of the Field Viewpoint 

Quantum field theory is the application of quantum mechanics to dynamical 
systems of fields, in the same sense that the basic course in quantum mechanics 
is concerned mainly with the quantization of dynamical systems of particles. 
It is a subject that is absolutely essential for understanding the current state 
of elementary particle physics. With some modification, the methods we will 
discuss also play a crucial role in the most active areas of atomic, nuclear, 
and condensed-matter physics. In Part I of this book, however, our primary 
concern will be with elementary particles, and hence relativistic fields. 

Given that we wish to understand processes that occur at very small 
(quantum-mechanical) scales and very large (relativistic) energies, one might 
still ask why we must study the quantization of fields. Why can’t we just 
quantize relativistic particles the way we quantized nonrelativistic particles? 

This question can be answered on a number of levels. Perhaps the best
approach is to write down a single-particle relativistic wave equation (such as
the Klein-Gordon equation or the Dirac equation) and see that it gives rise to
negative-energy states and other inconsistencies. Since this discussion usually
takes place near the end of a graduate-level quantum mechanics course, we will
not repeat it here. It is easy, however, to understand why such an approach
cannot work. We have no right to assume that any relativistic process can be
explained in terms of a single particle, since the Einstein relation
$E = mc^2$ allows for the creation of particle-antiparticle pairs. Even when
there is not enough energy for pair creation, multiparticle states appear, for
example, as intermediate states in second-order perturbation theory. We can
think of such states as existing only for a very short time, according to the
uncertainty principle $\Delta E \cdot \Delta t = \hbar$. As we go to higher
orders in perturbation theory, arbitrarily many such “virtual” particles can be
created.

The necessity of having a multiparticle theory also arises in a less obvious
way, from considerations of causality. Consider the amplitude for a free
particle to propagate from $\mathbf{x}_0$ to $\mathbf{x}$:

$$U(t) = \langle \mathbf{x} |\, e^{-iHt} \,| \mathbf{x}_0 \rangle.$$




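In nonrelativistic quantum mechanics we have $E = \mathbf{p}^2/2m$, so

$$\begin{aligned}
U(t) &= \langle \mathbf{x} |\, e^{-i(\mathbf{p}^2/2m)t} \,| \mathbf{x}_0 \rangle \\
&= \int \frac{d^3p}{(2\pi)^3}\, \langle \mathbf{x} |\, e^{-i(\mathbf{p}^2/2m)t} \,| \mathbf{p} \rangle \langle \mathbf{p} | \mathbf{x}_0 \rangle \\
&= \frac{1}{(2\pi)^3} \int d^3p\; e^{-i(\mathbf{p}^2/2m)t} \cdot e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}_0)} \\
&= \left(\frac{m}{2\pi i t}\right)^{3/2} e^{im(\mathbf{x}-\mathbf{x}_0)^2/2t}.
\end{aligned}$$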

This expression is nonzero for all $x$ and $t$, indicating that a particle can
propagate between any two points in an arbitrarily short time. In a
relativistic theory, this conclusion would signal a violation of causality. One
might hope that using the relativistic expression $E = \sqrt{p^2 + m^2}$ would
help, but it does not. In analogy with the nonrelativistic case, we have

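$$\begin{aligned}
U(t) &= \langle \mathbf{x} |\, e^{-it\sqrt{\mathbf{p}^2+m^2}} \,| \mathbf{x}_0 \rangle \\
&= \frac{1}{(2\pi)^3} \int d^3p\; e^{-it\sqrt{\mathbf{p}^2+m^2}} \cdot e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{x}_0)} \\
&= \frac{1}{2\pi^2 |\mathbf{x}-\mathbf{x}_0|} \int_0^\infty dp\; p\,\sin(p|\mathbf{x}-\mathbf{x}_0|)\, e^{-it\sqrt{p^2+m^2}}.
\end{aligned}$$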

This integral can be evaluated explicitly in terms of Bessel functions.* We
will content ourselves with looking at its asymptotic behavior for
$x^2 \gg t^2$ (well outside the light-cone), using the method of stationary
phase. The phase function $px - t\sqrt{p^2 + m^2}$ has a stationary point at
$p = imx/\sqrt{x^2 - t^2}$. We may freely push the contour upward so that it
goes through this point. Plugging in this value for $p$, we find that, up to a
rational function of $x$ and $t$,

$$U(t) \sim e^{-m\sqrt{x^2 - t^2}}.$$

Thus the propagation amplitude is small but nonzero outside the light-cone, 
and causality is still violated. 
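
The stationary-phase step can be verified with complex arithmetic. The sketch
below evaluates the phase function $f(p) = px - t\sqrt{p^2 + m^2}$ and its
derivative at the quoted stationary point, and confirms that $e^{if(p)}$ there
equals $e^{-m\sqrt{x^2 - t^2}}$. The function names and sample values of $x$,
$t$, and $m$ are assumptions chosen for illustration; the principal branch of
the square root is used.

```python
import cmath
import math


def phase(p, x, t, m):
    """Phase function f(p) = p*x - t*sqrt(p^2 + m^2) for complex p."""
    return p * x - t * cmath.sqrt(p * p + m * m)


def phase_derivative(p, x, t, m):
    """df/dp = x - t*p / sqrt(p^2 + m^2)."""
    return x - t * p / cmath.sqrt(p * p + m * m)


def stationary_point(x, t, m):
    """The purely imaginary stationary point p = i*m*x / sqrt(x^2 - t^2).

    Valid only outside the light-cone, where x^2 > t^2.
    """
    return complex(0.0, m * x / math.sqrt(x * x - t * t))


x, t, m = 5.0, 1.0, 1.0  # sample spacelike separation, x^2 > t^2
p0 = stationary_point(x, t, m)
suppression = cmath.exp(1j * phase(p0, x, t, m))

if __name__ == "__main__":
    print("df/dp at p0:", phase_derivative(p0, x, t, m))
    print("exp(i f(p0)):", suppression)
    print("exp(-m sqrt(x^2 - t^2)):", math.exp(-m * math.sqrt(x * x - t * t)))
```

The exponential is small for separations large compared with $1/m$, but it
never vanishes, which is the point of the argument.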

Quantum field theory solves the causality problem in a miraculous way,
which we will discuss in Section 2.4. We will find that, in the multiparticle
field theory, the propagation of a particle across a spacelike interval is
indistinguishable from the propagation of an antiparticle in the opposite
direction (see Fig. 2.1). When we ask whether an observation made at point
$x_0$ can affect an observation made at point $x$, we will find that the
amplitudes for particle and antiparticle propagation exactly cancel, so
causality is preserved.

Quantum field theory provides a natural way to handle not only
multiparticle states, but also transitions between states of different particle
number. It solves the causality problem by introducing antiparticles, then goes
on to explain the relation between spin and statistics. But most important, it
provides the tools necessary to calculate innumerable scattering cross
sections, particle lifetimes, and other observable quantities. The experimental
confirmation of these predictions, often to an unprecedented level of accuracy,
is our real reason for studying quantum field theory.

*See Gradshteyn and Ryzhik (1980), #3.914.

Figure 2.1. Propagation from $x_0$ to $x$ in one frame looks like propagation
from $x$ to $x_0$ in another frame.


2.2 Elements of Classical Field Theory 

In this section we review some of the formalism of classical field theory that 
will be necessary in our subsequent discussion of quantum field theory. 

Lagrangian Field Theory 

The fundamental quantity of classical mechanics is the action, $S$, the time
integral of the Lagrangian, $L$. In a local field theory the Lagrangian can be
written as the spatial integral of a Lagrangian density, denoted by
$\mathcal{L}$, which is a function of one or more fields $\phi(x)$ and their
derivatives $\partial_\mu\phi$. Thus we have

$$S = \int L\,dt = \int \mathcal{L}(\phi, \partial_\mu\phi)\,d^4x. \tag{2.1}$$

Since this is a book on field theory, we will refer to $\mathcal{L}$ simply as
the Lagrangian.

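The principle of least action states that when a system evolves from one
given configuration to another between times $t_1$ and $t_2$, it does so along
the “path” in configuration space for which $S$ is an extremum (normally a
minimum). We can write this condition as

$$\begin{aligned}
0 = \delta S &= \int d^4x \left\{ \frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\delta(\partial_\mu\phi) \right\} \\
&= \int d^4x \left\{ \frac{\partial\mathcal{L}}{\partial\phi}\,\delta\phi - \partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)\delta\phi + \partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\delta\phi\right) \right\}.
\end{aligned} \tag{2.2}$$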

The last term can be turned into a surface integral over the boundary of the
four-dimensional spacetime region of integration. Since the initial and final
field configurations are assumed given, $\delta\phi$ is zero at the temporal
beginning and end of this region. If we restrict our consideration to
deformations $\delta\phi$ that vanish on the spatial boundary of the region as
well, then the surface term is zero. Factoring out the $\delta\phi$ from the
first two terms, we note that, since the integral must vanish for arbitrary
$\delta\phi$, the quantity that multiplies $\delta\phi$ must vanish at all
points. Thus we arrive at the Euler-Lagrange equation of motion for a field,

$$\partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right) - \frac{\partial\mathcal{L}}{\partial\phi} = 0. \tag{2.3}$$


If the Lagrangian contains more than one field, there is one such equation for 
each. 


Hamiltonian Field Theory 


The Lagrangian formulation of field theory is particularly suited to
relativistic dynamics because all expressions are explicitly Lorentz invariant.
Nevertheless we will use the Hamiltonian formulation throughout the first part
of this book, since it will make the transition to quantum mechanics easier.
Recall that for a discrete system one can define a conjugate momentum
$p \equiv \partial L/\partial\dot{q}$ (where $\dot{q} = \partial q/\partial t$)
for each dynamical variable $q$. The Hamiltonian is then
$H \equiv \sum p\dot{q} - L$. The generalization to a continuous system is best
understood by pretending that the spatial points $\mathbf{x}$ are discretely
spaced. We can define


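$$\begin{aligned}
p(\mathbf{x}) \equiv \frac{\partial L}{\partial\dot\phi(\mathbf{x})} &= \frac{\partial}{\partial\dot\phi(\mathbf{x})} \int \mathcal{L}\bigl(\phi(\mathbf{y}), \dot\phi(\mathbf{y})\bigr)\,d^3y \\
&\sim \frac{\partial}{\partial\dot\phi(\mathbf{x})} \sum_{\mathbf{y}} \mathcal{L}\bigl(\phi(\mathbf{y}), \dot\phi(\mathbf{y})\bigr)\,d^3y \\
&= \pi(\mathbf{x})\,d^3x,
\end{aligned}$$

where

$$\pi(\mathbf{x}) \equiv \frac{\partial\mathcal{L}}{\partial\dot\phi(\mathbf{x})} \tag{2.4}$$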


is called the momentum density conjugate to $\phi(\mathbf{x})$. Thus the
Hamiltonian can be written

$$H = \sum_{\mathbf{x}} p(\mathbf{x})\,\dot\phi(\mathbf{x}) - L.$$


Passing to the continuum, this becomes

$$H = \int d^3x\,\bigl[\pi(\mathbf{x})\,\dot\phi(\mathbf{x}) - \mathcal{L}\bigr] \equiv \int d^3x\,\mathcal{H}. \tag{2.5}$$

We will rederive this expression for the Hamiltonian density $\mathcal{H}$
near the end of this section, using a different method.

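As a simple example, consider the theory of a single field $\phi(x)$, governed
by the Lagrangian

$$\begin{aligned}
\mathcal{L} &= \tfrac{1}{2}\dot\phi^2 - \tfrac{1}{2}(\nabla\phi)^2 - \tfrac{1}{2}m^2\phi^2 \\
&= \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m^2\phi^2.
\end{aligned} \tag{2.6}$$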

For now we take $\phi$ to be a real-valued field. The quantity $m$ will be
interpreted as a mass in Section 2.3, but for now just think of it as a
parameter. From this Lagrangian the usual procedure gives the equation of
motion

$$\left(\frac{\partial^2}{\partial t^2} - \nabla^2 + m^2\right)\phi = 0 \quad\text{or}\quad \bigl(\partial^\mu\partial_\mu + m^2\bigr)\phi = 0, \tag{2.7}$$

which is the well-known Klein-Gordon equation. (In this context it is a
classical field equation, like Maxwell’s equations, not a quantum-mechanical
wave equation.) Noting that the canonical momentum density conjugate to
$\phi(x)$ is $\pi(x) = \dot\phi(x)$, we can also construct the Hamiltonian:

$$H = \int d^3x\,\mathcal{H} = \int d^3x\,\Bigl[\tfrac{1}{2}\pi^2 + \tfrac{1}{2}(\nabla\phi)^2 + \tfrac{1}{2}m^2\phi^2\Bigr]. \tag{2.8}$$

We can think of the three terms, respectively, as the energy cost of “moving” 
in time, the energy cost of “shearing” in space, and the energy cost of having 
the field around at all. We will investigate this Hamiltonian much further in 
Sections 2.3 and 2.4. 
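
As a quick sanity check on (2.7), the sketch below tests a real plane wave in
one space dimension, $\phi = A\cos(kx - \omega t)$ with
$\omega^2 = k^2 + m^2$, against the Klein-Gordon operator using centered finite
differences. The plane-wave ansatz, the restriction to one space dimension, the
step size, and the function names are all assumptions made for this
illustration.

```python
import math


def plane_wave(t, x, k, m, amplitude=1.0):
    """Real plane wave A*cos(k*x - omega*t) with omega = sqrt(k^2 + m^2)."""
    omega = math.sqrt(k * k + m * m)
    return amplitude * math.cos(k * x - omega * t)


def second_difference(f, u, h):
    """Centered finite-difference estimate of the second derivative of f at u."""
    return (f(u + h) - 2.0 * f(u) + f(u - h)) / (h * h)


def klein_gordon_residual(t, x, k, m, h=1e-3):
    """Evaluate (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x) numerically."""
    d2_dt2 = second_difference(lambda s: plane_wave(s, x, k, m), t, h)
    d2_dx2 = second_difference(lambda y: plane_wave(t, y, k, m), x, h)
    return d2_dt2 - d2_dx2 + m * m * plane_wave(t, x, k, m)


if __name__ == "__main__":
    # The residual should vanish up to discretization error of order h^2.
    print(klein_gordon_residual(t=0.4, x=1.3, k=2.0, m=1.5))
```

The residual is zero only because $\omega$ is tied to $k$ and $m$; a wave with
any other frequency would leave a term proportional to
$(k^2 + m^2 - \omega^2)\phi$.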

Noether’s Theorem 

Next let us discuss the relationship between symmetries and conservation
laws in classical field theory, summarized in Noether’s theorem. This theorem
concerns continuous transformations on the fields $\phi$, which in
infinitesimal form can be written

$$\phi(x) \to \phi'(x) = \phi(x) + \alpha\,\Delta\phi(x), \tag{2.9}$$

where $\alpha$ is an infinitesimal parameter and $\Delta\phi$ is some
deformation of the field configuration. We call this transformation a symmetry
if it leaves the equations of motion invariant. This is ensured if the action
is invariant under (2.9). More generally, we can allow the action to change by
a surface term, since the presence of such a term would not affect our
derivation of the Euler-Lagrange equations of motion (2.3). The Lagrangian,
therefore, must be invariant under (2.9) up to a 4-divergence:

$$\mathcal{L}(x) \to \mathcal{L}(x) + \alpha\,\partial_\mu\mathcal{J}^\mu(x), \tag{2.10}$$

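for some $\mathcal{J}^\mu$. Let us compare this expectation for
$\Delta\mathcal{L}$ to the result obtained by varying the fields:

$$\begin{aligned}
\alpha\,\Delta\mathcal{L} &= \frac{\partial\mathcal{L}}{\partial\phi}\,(\alpha\,\Delta\phi) + \left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)\partial_\mu(\alpha\,\Delta\phi) \\
&= \alpha\,\partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\Delta\phi\right) + \alpha\left[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)\right]\Delta\phi.
\end{aligned} \tag{2.11}$$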

The second term vanishes by the Euler-Lagrange equation (2.3). We set the
remaining term equal to $\alpha\,\partial_\mu\mathcal{J}^\mu$ and find

$$\partial_\mu j^\mu(x) = 0, \quad\text{for}\quad j^\mu(x) = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\Delta\phi - \mathcal{J}^\mu. \tag{2.12}$$

(If the symmetry involves more than one field, the first term of this
expression for $j^\mu(x)$ should be replaced by a sum of such terms, one for
each field.) This result states that the current $j^\mu(x)$ is conserved. For
each continuous symmetry of $\mathcal{L}$, we have such a conservation law.

The conservation law can also be expressed by saying that the charge

$$Q \equiv \int_{\text{all space}} j^0\,d^3x \tag{2.13}$$

is a constant in time. Note, however, that the formulation of field theory in 
terms of a local Lagrangian density leads directly to the local form of the 
conservation law, Eq. (2.12). 

The easiest example of such a conservation law arises from a Lagrangian
with only a kinetic term: $\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2$. The
transformation $\phi \to \phi + \alpha$, where $\alpha$ is a constant, leaves
$\mathcal{L}$ unchanged, so we conclude that the current
$j^\mu = \partial^\mu\phi$ is conserved. As a less trivial example, consider
the Lagrangian

$$\mathcal{L} = |\partial_\mu\phi|^2 - m^2|\phi|^2, \tag{2.14}$$

where $\phi$ is now a complex-valued field. You can easily show that the
equation of motion for this Lagrangian is again the Klein-Gordon equation,
(2.7). This Lagrangian is invariant under the transformation
$\phi \to e^{i\alpha}\phi$; for an infinitesimal transformation we have

$$\alpha\,\Delta\phi = i\alpha\,\phi; \qquad \alpha\,\Delta\phi^* = -i\alpha\,\phi^*. \tag{2.15}$$

(We treat <j> and q>* as independent fields. Alternatively, we could work with 
the real and imaginary parts of <f>.) It is now a simple matter to show that the 
conserved Noether current is 

f =i[(&' 0 *) 0 -^(d» 0 )\. (2.16) 

(The overall constant has been chosen arbitrarily.) You can check directly that 
the divergence of this current vanishes by using the Klein-Gordon equation. 
Later we will add terms to this Lagrangian that couple $\phi$ to an electromagnetic
field. We will then interpret $j^\mu$ as the electromagnetic current density carried
by the field, and the spatial integral of $j^0$ as its electric charge.
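The conservation of the current (2.16) can be checked numerically. The sketch below is an illustration only: it works in 1+1 dimensions, builds a superposition of two plane waves, and estimates $\partial_\mu j^\mu$ by central differences. The names `make_field`, `current` and `divergence`, the mass, the amplitudes, and the deliberately wrong dispersion relation used for comparison are all assumptions made for this example.

```python
import cmath
import math

MASS = 1.0  # illustrative mass, in units where hbar = c = 1


def make_field(modes, on_shell=True):
    """Build phi(t, x) = sum_k c_k exp(-i(E_k t - p_k x)) and its first derivatives.

    modes is a list of (amplitude, momentum) pairs. With on_shell=True the
    energies obey E^2 = p^2 + m^2, so phi solves the Klein-Gordon equation.
    """
    def energy(p):
        if on_shell:
            return math.sqrt(p * p + MASS * MASS)
        return abs(p) + 0.3  # deliberately wrong dispersion relation

    def wave(c, p, t, x):
        return c * cmath.exp(-1j * (energy(p) * t - p * x))

    def phi(t, x):
        return sum(wave(c, p, t, x) for c, p in modes)

    def dphi_dt(t, x):
        return sum(-1j * energy(p) * wave(c, p, t, x) for c, p in modes)

    def dphi_dx(t, x):
        return sum(1j * p * wave(c, p, t, x) for c, p in modes)

    return phi, dphi_dt, dphi_dx


def current(field, t, x):
    """Return (j^0, j^1) from Eq. (2.16); note that d^1 = -d/dx."""
    phi, dphi_dt, dphi_dx = field
    f, ft, fx = phi(t, x), dphi_dt(t, x), dphi_dx(t, x)
    j0 = 1j * (ft.conjugate() * f - f.conjugate() * ft)
    j1 = -1j * (fx.conjugate() * f - f.conjugate() * fx)
    return j0.real, j1.real


def divergence(field, t, x, h=1e-5):
    """Central-difference estimate of d_t j^0 + d_x j^1."""
    dj0 = (current(field, t + h, x)[0] - current(field, t - h, x)[0]) / (2 * h)
    dj1 = (current(field, t, x + h)[1] - current(field, t, x - h)[1]) / (2 * h)
    return dj0 + dj1


MODES = [(1.0, 0.5), (0.7, -1.2)]
div_on_shell = divergence(make_field(MODES), 0.3, 0.4)
div_off_shell = divergence(make_field(MODES, on_shell=False), 0.3, 0.4)
print(div_on_shell, div_off_shell)
```

The first number should vanish up to finite-difference error, while the second should not: the current is conserved only when the field obeys the equation of motion.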

Noether’s theorem can also be applied to spacetime transformations such 
as translations and rotations. We can describe the infinitesimal translation

$$x^\mu \to x^\mu - a^\mu$$

alternatively as a transformation of the field configuration

$$\phi(x) \to \phi(x + a) = \phi(x) + a^\mu\partial_\mu\phi(x).$$

The Lagrangian is also a scalar, so it must transform in the same way:

$$\mathcal{L} \to \mathcal{L} + a^\mu\partial_\mu\mathcal{L} = \mathcal{L} + a^\nu\partial_\mu\!\left(\delta^\mu{}_\nu\,\mathcal{L}\right).$$

Comparing this equation to (2.10), we see that we now have a nonzero $\mathcal{J}^\mu$.
Taking this into account, we can apply the theorem to obtain four separately
conserved currents:

$$T^\mu{}_\nu \equiv \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\,\partial_\nu\phi - \mathcal{L}\,\delta^\mu{}_\nu. \tag{2.17}$$

This is precisely the stress-energy tensor, also called the energy-momentum
tensor, of the field $\phi$. The conserved charge associated with time translations
is the Hamiltonian:

$$H = \int T^{00}\, d^3x = \int \mathcal{H}\, d^3x. \tag{2.18}$$

By computing this quantity for the Klein-Gordon field, one can recover the
result (2.8). The conserved charges associated with spatial translations are

$$P^i = \int T^{0i}\, d^3x = -\int \pi\,\partial_i\phi\; d^3x, \tag{2.19}$$

and we naturally interpret this as the (physical) momentum carried by the
field (not to be confused with the canonical momentum).


2.3 The Klein-Gordon Field as Harmonic Oscillators 

We begin our discussion of quantum field theory with a rather formal treatment
of the simplest type of field: the real Klein-Gordon field. The idea is to
start with a classical field theory (the theory of a classical scalar field governed
by the Lagrangian (2.6)) and then “quantize” it, that is, reinterpret the
dynamical variables as operators that obey canonical commutation relations.†
We will then “solve” the theory by finding the eigenvalues and eigenstates of
the Hamiltonian, using the harmonic oscillator as an analogy.

The classical theory of the real Klein-Gordon field was discussed briefly
(but sufficiently) in the previous section; the relevant expressions are given in
Eqs. (2.6), (2.7), and (2.8). To quantize the theory, we follow the same procedure
as for any other dynamical system: We promote $\phi$ and $\pi$ to operators,
and impose suitable commutation relations. Recall that for a discrete system
of one or more particles the commutation relations are

$$[q_i, p_j] = i\delta_{ij}; \qquad [q_i, q_j] = [p_i, p_j] = 0.$$

†This procedure is sometimes called second quantization, to distinguish the
resulting Klein-Gordon equation (in which $\phi$ is an operator) from the old one-particle
Klein-Gordon equation (in which $\phi$ was a wavefunction). In this book we never adopt
the latter point of view; we start with a classical equation (in which $\phi$ is a classical
field) and quantize it exactly once.

For a continuous system the generalization is quite natural; since $\pi(\mathbf{x})$ is the
momentum density, we get a Dirac delta function instead of a Kronecker delta:

$$[\phi(\mathbf{x}), \pi(\mathbf{y})] = i\delta^{(3)}(\mathbf{x} - \mathbf{y}); \qquad [\phi(\mathbf{x}), \phi(\mathbf{y})] = [\pi(\mathbf{x}), \pi(\mathbf{y})] = 0. \tag{2.20}$$

(For now we work in the Schrödinger picture where $\phi$ and $\pi$ do not depend
on time. When we switch to the Heisenberg picture in the next section, these
“equal time” commutation relations will still hold provided that both operators
are considered at the same time.)

The Hamiltonian, being a function of $\phi$ and $\pi$, also becomes an operator.
Our next task is to find the spectrum of the Hamiltonian. Since there is
no obvious way to do this, let us seek guidance by writing the Klein-Gordon
equation in Fourier space. If we expand the classical Klein-Gordon field as

$$\phi(\mathbf{x}, t) = \int \frac{d^3p}{(2\pi)^3}\; e^{i\mathbf{p}\cdot\mathbf{x}}\, \phi(\mathbf{p}, t)$$

(with $\phi^*(\mathbf{p}) = \phi(-\mathbf{p})$ so that $\phi(\mathbf{x})$ is real), the Klein-Gordon equation (2.7)
becomes

$$\left[\frac{\partial^2}{\partial t^2} + \left(|\mathbf{p}|^2 + m^2\right)\right]\phi(\mathbf{p}, t) = 0. \tag{2.21}$$

This is the same as the equation of motion for a simple harmonic oscillator
with frequency

$$\omega_{\mathbf{p}} = \sqrt{|\mathbf{p}|^2 + m^2}. \tag{2.22}$$


The simple harmonic oscillator is a system whose spectrum we already
know how to find. Let us briefly recall how it is done. We write the Hamiltonian
as

$$H_{\rm SHO} = \tfrac{1}{2}p^2 + \tfrac{1}{2}\omega^2\phi^2.$$

To find the eigenvalues of $H_{\rm SHO}$, we write $\phi$ and $p$ in terms of ladder operators:

$$\phi = \frac{1}{\sqrt{2\omega}}\left(a + a^\dagger\right); \qquad p = -i\sqrt{\frac{\omega}{2}}\left(a - a^\dagger\right). \tag{2.23}$$

The canonical commutation relation $[\phi, p] = i$ is equivalent to

$$[a, a^\dagger] = 1. \tag{2.24}$$

The Hamiltonian can now be rewritten

$$H_{\rm SHO} = \omega\left(a^\dagger a + \tfrac{1}{2}\right).$$

The state $|0\rangle$ such that $a|0\rangle = 0$ is an eigenstate of $H$ with eigenvalue $\tfrac{1}{2}\omega$,
the zero-point energy. Furthermore, the commutators

$$[H_{\rm SHO}, a^\dagger] = \omega a^\dagger, \qquad [H_{\rm SHO}, a] = -\omega a$$

make it easy to verify that the states

$$|n\rangle \equiv (a^\dagger)^n |0\rangle$$

are eigenstates of $H_{\rm SHO}$ with eigenvalues $(n + \tfrac{1}{2})\omega$. These states exhaust the
spectrum.
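The ladder-operator algebra recalled above can be illustrated with finite matrices. The sketch below is a minimal example under stated assumptions: it truncates the Fock space to `N` normalized number states (the text's $|n\rangle$ is unnormalized, which does not affect the eigenvalues), builds $\phi$ and $p$ from Eq. (2.23), and inspects $[\phi, p]$ and $H_{\rm SHO}$. The truncation size, the frequency, and the helper names `matmul` and `add` are choices made for this illustration; the truncation necessarily spoils the highest level.

```python
import math

N = 8          # size of the truncated Fock space (an arbitrary illustrative choice)
OMEGA = 1.7    # oscillator frequency (illustrative)


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b, sa=1.0, sb=1.0):
    """Return the linear combination sa*a + sb*b."""
    n = len(a)
    return [[sa * a[i][j] + sb * b[i][j] for j in range(n)] for i in range(n)]


# Lowering operator in the normalized number basis: <n-1| a |n> = sqrt(n).
lower = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)] for i in range(N)]
# The matrix is real, so its Hermitian conjugate is just the transpose.
raise_ = [[lower[j][i] for j in range(N)] for i in range(N)]

# Eq. (2.23): phi = (a + a^dagger)/sqrt(2 omega), p = -i sqrt(omega/2) (a - a^dagger).
phi = add(lower, raise_, 1 / math.sqrt(2 * OMEGA), 1 / math.sqrt(2 * OMEGA))
p = add(lower, raise_, -1j * math.sqrt(OMEGA / 2), 1j * math.sqrt(OMEGA / 2))

commutator = add(matmul(phi, p), matmul(p, phi), 1.0, -1.0)            # [phi, p]
hamiltonian = add(matmul(p, p), matmul(phi, phi), 0.5, 0.5 * OMEGA ** 2)

for n in range(N - 1):  # the last level is corrupted by the truncation
    print(n, commutator[n][n], hamiltonian[n][n].real)
```

Below the truncation edge, the diagonal of the commutator is $i$ and the Hamiltonian is diagonal with entries $(n + \tfrac{1}{2})\omega$, as stated in the text.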

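We can find the spectrum of the Klein-Gordon Hamiltonian using the
same trick, but now each Fourier mode of the field is treated as an independent
oscillator with its own $a$ and $a^\dagger$. In analogy with (2.23) we write

$$\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\left(a_{\mathbf{p}}\,e^{i\mathbf{p}\cdot\mathbf{x}} + a_{\mathbf{p}}^\dagger\,e^{-i\mathbf{p}\cdot\mathbf{x}}\right); \tag{2.25}$$

$$\pi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\,(-i)\sqrt{\frac{\omega_{\mathbf{p}}}{2}}\left(a_{\mathbf{p}}\,e^{i\mathbf{p}\cdot\mathbf{x}} - a_{\mathbf{p}}^\dagger\,e^{-i\mathbf{p}\cdot\mathbf{x}}\right). \tag{2.26}$$

The inverse expressions for $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$ in terms of $\phi$ and $\pi$ are easy to derive
but rarely needed. In the calculations below we will find it useful to rearrange
(2.25) and (2.26) as follows:

$$\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2\omega_{\mathbf{p}}}}\left(a_{\mathbf{p}} + a_{-\mathbf{p}}^\dagger\right)e^{i\mathbf{p}\cdot\mathbf{x}}; \tag{2.27}$$

$$\pi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\,(-i)\sqrt{\frac{\omega_{\mathbf{p}}}{2}}\left(a_{\mathbf{p}} - a_{-\mathbf{p}}^\dagger\right)e^{i\mathbf{p}\cdot\mathbf{x}}. \tag{2.28}$$

The commutation relation (2.24) becomes

$$[a_{\mathbf{p}}, a_{\mathbf{p}'}^\dagger] = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{p}'), \tag{2.29}$$

from which you can verify that the commutator of $\phi$ and $\pi$ works out correctly:

$$\begin{aligned}
[\phi(\mathbf{x}), \pi(\mathbf{x}')] &= \int \frac{d^3p\, d^3p'}{(2\pi)^6}\,\frac{-i}{2}\sqrt{\frac{\omega_{\mathbf{p}'}}{\omega_{\mathbf{p}}}}\left([a_{-\mathbf{p}}^\dagger, a_{\mathbf{p}'}] - [a_{\mathbf{p}}, a_{-\mathbf{p}'}^\dagger]\right)e^{i(\mathbf{p}\cdot\mathbf{x} + \mathbf{p}'\cdot\mathbf{x}')} \\
&= i\delta^{(3)}(\mathbf{x} - \mathbf{x}').
\end{aligned} \tag{2.30}$$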


(If computations such as this one and the next are unfamiliar to you, please 
work them out carefully; they are quite easy after a little practice, and are 
fundamental to the formalism of the next two chapters.) 

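We are now ready to express the Hamiltonian in terms of ladder operators.
Starting from its expression (2.8) in terms of $\phi$ and $\pi$, we have

$$\begin{aligned}
H &= \int d^3x \int \frac{d^3p\, d^3p'}{(2\pi)^6}\, e^{i(\mathbf{p}+\mathbf{p}')\cdot\mathbf{x}} \bigg\{ -\frac{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{p}'}}}{4}\left(a_{\mathbf{p}} - a_{-\mathbf{p}}^\dagger\right)\left(a_{\mathbf{p}'} - a_{-\mathbf{p}'}^\dagger\right) \\
&\qquad\qquad + \frac{-\mathbf{p}\cdot\mathbf{p}' + m^2}{4\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{p}'}}}\left(a_{\mathbf{p}} + a_{-\mathbf{p}}^\dagger\right)\left(a_{\mathbf{p}'} + a_{-\mathbf{p}'}^\dagger\right) \bigg\} \\
&= \int \frac{d^3p}{(2\pi)^3}\; \omega_{\mathbf{p}}\left(a_{\mathbf{p}}^\dagger a_{\mathbf{p}} + \tfrac{1}{2}\,[a_{\mathbf{p}}, a_{\mathbf{p}}^\dagger]\right).
\end{aligned} \tag{2.31}$$

The second term is proportional to $\delta(0)$, an infinite c-number. It is simply
the sum over all modes of the zero-point energies $\omega_{\mathbf{p}}/2$, so its presence is
completely expected, if somewhat disturbing. Fortunately, this infinite energy
shift cannot be detected experimentally, since experiments measure only energy
differences from the ground state of $H$. We will therefore ignore this
infinite constant term in all of our calculations. It is possible that this energy
shift of the ground state could create a problem at a deeper level in the
theory; we will discuss this matter in the Epilogue.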

Using this expression for the Hamiltonian in terms of $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$, it is easy
to evaluate the commutators

$$[H, a_{\mathbf{p}}^\dagger] = \omega_{\mathbf{p}}\,a_{\mathbf{p}}^\dagger; \qquad [H, a_{\mathbf{p}}] = -\omega_{\mathbf{p}}\,a_{\mathbf{p}}. \tag{2.32}$$

We can now write down the spectrum of the theory, just as for the harmonic
oscillator. The state $|0\rangle$ such that $a_{\mathbf{p}}|0\rangle = 0$ for all $\mathbf{p}$ is the ground state or
vacuum, and has $E = 0$ after we drop the infinite constant in (2.31). All other
energy eigenstates can be built by acting on $|0\rangle$ with creation operators. In
general, the state $a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger \cdots |0\rangle$ is an eigenstate of $H$ with energy $\omega_{\mathbf{p}} + \omega_{\mathbf{q}} + \cdots$.
These states exhaust the spectrum.

Having found the spectrum of the Hamiltonian, let us try to interpret its
eigenstates. From (2.19) and a calculation similar to (2.31) we can write down
the total momentum operator,

$$\mathbf{P} = -\int d^3x\; \pi(\mathbf{x})\,\nabla\phi(\mathbf{x}) = \int \frac{d^3p}{(2\pi)^3}\; \mathbf{p}\; a_{\mathbf{p}}^\dagger a_{\mathbf{p}}. \tag{2.33}$$

So the operator $a_{\mathbf{p}}^\dagger$ creates momentum $\mathbf{p}$ and energy $\omega_{\mathbf{p}} = \sqrt{|\mathbf{p}|^2 + m^2}$. Similarly,
the state $a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger \cdots |0\rangle$ has momentum $\mathbf{p} + \mathbf{q} + \cdots$. It is quite natural to
call these excitations particles, since they are discrete entities that have the
proper relativistic energy-momentum relation. (By a particle we do not mean
something that must be localized in space; $a_{\mathbf{p}}^\dagger$ creates particles in momentum
eigenstates.) From now on we will refer to $\omega_{\mathbf{p}}$ as $E_{\mathbf{p}}$ (or simply $E$), since it
really is the energy of a particle. Note, by the way, that the energy is always
positive: $E_{\mathbf{p}} = +\sqrt{|\mathbf{p}|^2 + m^2}$.

This formalism also allows us to determine the statistics of our particles.
Consider the two-particle state $a_{\mathbf{p}}^\dagger a_{\mathbf{q}}^\dagger |0\rangle$. Since $a_{\mathbf{p}}^\dagger$ and $a_{\mathbf{q}}^\dagger$ commute, this state
is identical to the state $a_{\mathbf{q}}^\dagger a_{\mathbf{p}}^\dagger |0\rangle$ in which the two particles are interchanged.
Moreover, a single mode $\mathbf{p}$ can contain arbitrarily many particles (just as a
simple harmonic oscillator can be excited to arbitrarily high levels). Thus we
conclude that Klein-Gordon particles obey Bose-Einstein statistics.

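We naturally choose to normalize the vacuum state so that $\langle 0|0\rangle = 1$.
The one-particle states $|\mathbf{p}\rangle \propto a_{\mathbf{p}}^\dagger |0\rangle$ will also appear quite often, and it is
worthwhile to adopt a convention for their normalization. The simplest
normalization $\langle\mathbf{p}|\mathbf{q}\rangle = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})$ (which many books use) is not Lorentz
invariant, as we can demonstrate by considering the effect of a boost in the
3-direction. Under such a boost we have $p_3' = \gamma(p_3 + \beta E)$, $E' = \gamma(E + \beta p_3)$.
Using the delta function identity

$$\delta\big(f(x) - f(x_0)\big) = \frac{1}{|f'(x_0)|}\,\delta(x - x_0), \tag{2.34}$$

we can compute

$$\begin{aligned}
\delta^{(3)}(\mathbf{p} - \mathbf{q}) &= \delta^{(3)}(\mathbf{p}' - \mathbf{q}')\cdot\frac{dp_3'}{dp_3} \\
&= \delta^{(3)}(\mathbf{p}' - \mathbf{q}')\,\gamma\left(1 + \beta\frac{dE}{dp_3}\right) \\
&= \delta^{(3)}(\mathbf{p}' - \mathbf{q}')\,\frac{\gamma}{E}\,(E + \beta p_3) \\
&= \delta^{(3)}(\mathbf{p}' - \mathbf{q}')\,\frac{E'}{E}.
\end{aligned}$$

The problem is that volumes are not invariant under boosts; a box whose
volume is $V$ in its rest frame has volume $V/\gamma$ in a boosted frame, due to
Lorentz contraction. But from the above calculation, we see that the quantity
$E_{\mathbf{p}}\,\delta^{(3)}(\mathbf{p} - \mathbf{q})$ is Lorentz invariant. We therefore define

$$|\mathbf{p}\rangle = \sqrt{2E_{\mathbf{p}}}\; a_{\mathbf{p}}^\dagger |0\rangle, \tag{2.35}$$

so that

$$\langle\mathbf{p}|\mathbf{q}\rangle = 2E_{\mathbf{p}}\,(2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q}). \tag{2.36}$$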

(The factor of 2 is unnecessary, but is convenient because of the factor of 2 in 
Eq. (2.25).) 
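The key step in the boost computation, $dp_3'/dp_3 = E'/E$, is easy to check numerically. The following is a minimal sketch: the mass, the boost velocity `BETA`, the sample momentum, and the function names are all illustrative assumptions, and the derivative is estimated by a central difference at fixed $p_1$, $p_2$.

```python
import math

MASS = 1.0   # illustrative mass
BETA = 0.6   # illustrative boost velocity along the 3-direction
GAMMA = 1.0 / math.sqrt(1.0 - BETA ** 2)


def energy(p1, p2, p3):
    """On-shell energy E = sqrt(|p|^2 + m^2)."""
    return math.sqrt(p1 ** 2 + p2 ** 2 + p3 ** 2 + MASS ** 2)


def boosted_p3(p1, p2, p3):
    """p3' = gamma (p3 + beta E)."""
    return GAMMA * (p3 + BETA * energy(p1, p2, p3))


def boosted_energy(p1, p2, p3):
    """E' = gamma (E + beta p3)."""
    return GAMMA * (energy(p1, p2, p3) + BETA * p3)


p = (0.3, -0.4, 1.1)
h = 1e-6
# Central-difference estimate of dp3'/dp3 with p1 and p2 held fixed.
jacobian = (boosted_p3(p[0], p[1], p[2] + h) - boosted_p3(p[0], p[1], p[2] - h)) / (2 * h)
ratio = boosted_energy(*p) / energy(*p)
print(jacobian, ratio)
```

The two printed numbers should agree to within the finite-difference error, which is the statement that $E_{\mathbf{p}}\,\delta^{(3)}(\mathbf{p} - \mathbf{q})$ is boost invariant.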

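On the Hilbert space of quantum states, a Lorentz transformation $\Lambda$ will
be implemented as some unitary operator $U(\Lambda)$. Our normalization condition
(2.35) then implies that

$$U(\Lambda)\,|\mathbf{p}\rangle = |\Lambda\mathbf{p}\rangle. \tag{2.37}$$

If we prefer to think of this transformation as acting on the operator $a_{\mathbf{p}}^\dagger$, we
can also write

$$U(\Lambda)\, a_{\mathbf{p}}^\dagger\, U^{-1}(\Lambda) = \sqrt{\frac{E_{\Lambda\mathbf{p}}}{E_{\mathbf{p}}}}\; a_{\Lambda\mathbf{p}}^\dagger. \tag{2.38}$$

With this normalization we must divide by $2E_{\mathbf{p}}$ in other places. For example,
the completeness relation for the one-particle states is

$$(\mathbf{1})_{\text{1-particle}} = \int \frac{d^3p}{(2\pi)^3}\; |\mathbf{p}\rangle\, \frac{1}{2E_{\mathbf{p}}}\, \langle\mathbf{p}|, \tag{2.39}$$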


where the operator on the left is simply the identity within the subspace of
one-particle states, and zero in the rest of the Hilbert space. Integrals of this
form will occur quite often; in fact, the integral

$$\int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}} = \int \frac{d^4p}{(2\pi)^4}\,(2\pi)\,\delta(p^2 - m^2)\Big|_{p^0 > 0} \tag{2.40}$$

is a Lorentz-invariant 3-momentum integral, in the sense that if $f(p)$ is
Lorentz-invariant, so is $\int d^3p\, f(p)/(2E_{\mathbf{p}})$. The integration can be thought of
as being over the $p^0 > 0$ branch of the hyperboloid $p^2 = m^2$ in 4-momentum
space (see Fig. 2.2).

Figure 2.2. The Lorentz-invariant 3-momentum integral is over the upper
branch of the hyperboloid $p^2 = m^2$.
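The equality in (2.40) follows in one step from the delta function identity (2.34). As a short worked check, take $f(p^0) = (p^0)^2$ at fixed $\mathbf{p}$, so that $|f'(E_{\mathbf{p}})| = 2E_{\mathbf{p}}$ on the branch $p^0 > 0$:

```latex
\int dp^0\, \delta(p^2 - m^2)\Big|_{p^0 > 0}
  = \int_0^\infty dp^0\, \delta\big((p^0)^2 - E_{\mathbf{p}}^2\big)
  = \int_0^\infty dp^0\, \frac{1}{2E_{\mathbf{p}}}\, \delta(p^0 - E_{\mathbf{p}})
  = \frac{1}{2E_{\mathbf{p}}} .
```

Multiplying by $\int d^3p/(2\pi)^3$ reproduces the left-hand side of (2.40). Since $d^4p$ and $\delta(p^2 - m^2)$ are Lorentz invariant, and the sign of $p^0$ on the mass shell is preserved by continuous Lorentz transformations, the measure $d^3p/(2E_{\mathbf{p}})$ is invariant as well.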

Finally let us consider the interpretation of the state $\phi(\mathbf{x})\,|0\rangle$. From the
expansion (2.25) we see that

$$\phi(\mathbf{x})\,|0\rangle = \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\; e^{-i\mathbf{p}\cdot\mathbf{x}}\; |\mathbf{p}\rangle \tag{2.41}$$

is a linear superposition of single-particle states that have well-defined
momentum. Except for the factor $1/2E_{\mathbf{p}}$, this is the same as the familiar
nonrelativistic expression for the eigenstate of position $|\mathbf{x}\rangle$; in fact the extra factor
is nearly constant for small (nonrelativistic) $\mathbf{p}$. We will therefore put forward
the same interpretation, and claim that the operator $\phi(\mathbf{x})$, acting on the
vacuum, creates a particle at position $\mathbf{x}$. This interpretation is further confirmed
when we compute

$$\begin{aligned}
\langle 0|\,\phi(\mathbf{x})\,|\mathbf{p}\rangle &= \langle 0| \int \frac{d^3p'}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}'}}}\left(a_{\mathbf{p}'}\,e^{i\mathbf{p}'\cdot\mathbf{x}} + a_{\mathbf{p}'}^\dagger\,e^{-i\mathbf{p}'\cdot\mathbf{x}}\right)\sqrt{2E_{\mathbf{p}}}\; a_{\mathbf{p}}^\dagger\, |0\rangle \\
&= e^{i\mathbf{p}\cdot\mathbf{x}}.
\end{aligned} \tag{2.42}$$

We can interpret this as the position-space representation of the single-particle
wavefunction of the state $|\mathbf{p}\rangle$, just as in nonrelativistic quantum mechanics
$\langle\mathbf{x}|\mathbf{p}\rangle \propto e^{i\mathbf{p}\cdot\mathbf{x}}$ is the wavefunction of the state $|\mathbf{p}\rangle$.


2.4 The Klein-Gordon Field in Space-Time

In the previous section we quantized the Klein-Gordon field in the Schrödinger
picture, and interpreted the resulting theory in terms of relativistic particles.
In this section we will switch to the Heisenberg picture, where it will be easier
to discuss time-dependent quantities and questions of causality. After a few
preliminaries, we will return to the question of acausal propagation raised in
Section 2.1. We will also derive an expression for the Klein-Gordon propagator,
a crucial part of the Feynman rules to be developed in Chapter 4.

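In the Heisenberg picture, we make the operators $\phi$ and $\pi$ time-dependent
in the usual way:

$$\phi(x) = \phi(\mathbf{x}, t) = e^{iHt}\,\phi(\mathbf{x})\,e^{-iHt}, \tag{2.43}$$

and similarly for $\pi(x) = \pi(\mathbf{x}, t)$. The Heisenberg equation of motion,

$$i\frac{\partial}{\partial t}\mathcal{O} = [\mathcal{O}, H], \tag{2.44}$$

allows us to compute the time dependence of $\phi$ and $\pi$:

$$\begin{aligned}
i\frac{\partial}{\partial t}\phi(\mathbf{x}, t) &= \left[\phi(\mathbf{x}, t),\; \int d^3x' \left\{\tfrac{1}{2}\pi^2(\mathbf{x}', t) + \tfrac{1}{2}\big(\nabla\phi(\mathbf{x}', t)\big)^2 + \tfrac{1}{2}m^2\phi^2(\mathbf{x}', t)\right\}\right] \\
&= \int d^3x' \left(i\delta^{(3)}(\mathbf{x} - \mathbf{x}')\,\pi(\mathbf{x}', t)\right) \\
&= i\pi(\mathbf{x}, t);
\end{aligned}$$

$$\begin{aligned}
i\frac{\partial}{\partial t}\pi(\mathbf{x}, t) &= \left[\pi(\mathbf{x}, t),\; \int d^3x' \left\{\tfrac{1}{2}\pi^2(\mathbf{x}', t) + \tfrac{1}{2}\phi(\mathbf{x}', t)\left(-\nabla^2 + m^2\right)\phi(\mathbf{x}', t)\right\}\right] \\
&= -i\left(-\nabla^2 + m^2\right)\phi(\mathbf{x}, t).
\end{aligned}$$

Combining the two results gives

$$\frac{\partial^2}{\partial t^2}\phi = \left(\nabla^2 - m^2\right)\phi, \tag{2.45}$$

which is just the Klein-Gordon equation.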

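We can better understand the time dependence of $\phi(x)$ and $\pi(x)$ by writing
them in terms of creation and annihilation operators. First note that

$$H a_{\mathbf{p}} = a_{\mathbf{p}}\,(H - E_{\mathbf{p}}),$$

and hence

$$H^n a_{\mathbf{p}} = a_{\mathbf{p}}\,(H - E_{\mathbf{p}})^n,$$

for any $n$. A similar relation (with $-$ replaced by $+$) holds for $a_{\mathbf{p}}^\dagger$. Thus we
have derived the identities

$$e^{iHt}\, a_{\mathbf{p}}\, e^{-iHt} = a_{\mathbf{p}}\, e^{-iE_{\mathbf{p}}t}, \qquad e^{iHt}\, a_{\mathbf{p}}^\dagger\, e^{-iHt} = a_{\mathbf{p}}^\dagger\, e^{iE_{\mathbf{p}}t}, \tag{2.46}$$

which we can use on expression (2.25) for $\phi(\mathbf{x})$ to find the desired expression
for the Heisenberg operator $\phi(x)$, according to (2.43). (We will always use the
symbols $a_{\mathbf{p}}$ and $a_{\mathbf{p}}^\dagger$ to represent the time-independent, Schrödinger-picture
ladder operators.) The result is

$$\begin{aligned}
\phi(\mathbf{x}, t) &= \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\left(a_{\mathbf{p}}\,e^{-ip\cdot x} + a_{\mathbf{p}}^\dagger\,e^{ip\cdot x}\right)\bigg|_{p^0 = E_{\mathbf{p}}}; \\
\pi(\mathbf{x}, t) &= \frac{\partial}{\partial t}\phi(\mathbf{x}, t).
\end{aligned} \tag{2.47}$$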
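For readers who want the intermediate step behind the identities (2.46), it can be spelled out by expanding the exponential in a power series and using $H^n a_{\mathbf{p}} = a_{\mathbf{p}}(H - E_{\mathbf{p}})^n$ term by term. This is only a sketch of the argument summarized in the text:

```latex
e^{iHt}\, a_{\mathbf{p}}
  = \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, H^n a_{\mathbf{p}}
  = a_{\mathbf{p}} \sum_{n=0}^{\infty} \frac{(it)^n}{n!}\, (H - E_{\mathbf{p}})^n
  = a_{\mathbf{p}}\, e^{i(H - E_{\mathbf{p}})t}
\quad\Longrightarrow\quad
e^{iHt}\, a_{\mathbf{p}}\, e^{-iHt} = a_{\mathbf{p}}\, e^{-iE_{\mathbf{p}}t} .
```

The last step uses the fact that $E_{\mathbf{p}}$ is a number, so $e^{i(H - E_{\mathbf{p}})t} = e^{iHt}e^{-iE_{\mathbf{p}}t}$. The same argument with $H^n a_{\mathbf{p}}^\dagger = a_{\mathbf{p}}^\dagger (H + E_{\mathbf{p}})^n$ gives the second identity.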


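It is worth mentioning that we can perform the same manipulations with
$\mathbf{P}$ instead of $H$ to relate $\phi(\mathbf{x})$ to $\phi(0)$. In analogy with (2.46), one can show

$$e^{-i\mathbf{P}\cdot\mathbf{x}}\, a_{\mathbf{p}}\, e^{i\mathbf{P}\cdot\mathbf{x}} = a_{\mathbf{p}}\, e^{i\mathbf{p}\cdot\mathbf{x}}, \qquad e^{-i\mathbf{P}\cdot\mathbf{x}}\, a_{\mathbf{p}}^\dagger\, e^{i\mathbf{P}\cdot\mathbf{x}} = a_{\mathbf{p}}^\dagger\, e^{-i\mathbf{p}\cdot\mathbf{x}}, \tag{2.48}$$

and therefore

$$\begin{aligned}
\phi(x) &= e^{i(Ht - \mathbf{P}\cdot\mathbf{x})}\,\phi(0)\,e^{-i(Ht - \mathbf{P}\cdot\mathbf{x})} \\
&= e^{iP\cdot x}\,\phi(0)\,e^{-iP\cdot x},
\end{aligned} \tag{2.49}$$

where $P^\mu = (H, \mathbf{P})$. (The notation here is confusing but standard. Remember
that $\mathbf{P}$ is the momentum operator, whose eigenvalue is the total momentum of
the system. On the other hand, $\mathbf{p}$ is the momentum of a single Fourier mode
of the field, which we interpret as the momentum of a particle in that mode.
For a one-particle state of well-defined momentum, $\mathbf{p}$ is the eigenvalue of $\mathbf{P}$.)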

Equation (2.47) makes explicit the dual particle and wave interpretations
of the quantum field $\phi(x)$. On the one hand, $\phi(x)$ is written as a Hilbert space
operator, which creates and destroys the particles that are the quanta of field
excitation. On the other hand, $\phi(x)$ is written as a linear combination of
solutions ($e^{ip\cdot x}$ and $e^{-ip\cdot x}$) of the Klein-Gordon equation. Both signs of the time
dependence in the exponential appear: We find both $e^{-ip^0t}$ and $e^{+ip^0t}$,
although $p^0$ is always positive. If these were single-particle wavefunctions, they
would correspond to states of positive and negative energy; let us refer to
them more generally as positive- and negative-frequency modes. The connection
between the particle creation operators and the waveforms displayed here
is always valid for free quantum fields: A positive-frequency solution of the
field equation has as its coefficient the operator that destroys a particle in
that single-particle wavefunction. A negative-frequency solution of the field
equation, being the Hermitian conjugate of a positive-frequency solution, has
as its coefficient the operator that creates a particle in that positive-energy
single-particle wavefunction. In this way, the fact that relativistic wave
equations have both positive- and negative-frequency solutions is reconciled with
the requirement that a sensible quantum theory contain only positive excitation
energies.

Causality

Now let us return to the question of causality raised at the beginning of this
chapter. In our present formalism, still working in the Heisenberg picture, the
amplitude for a particle to propagate from $y$ to $x$ is $\langle 0|\,\phi(x)\phi(y)\,|0\rangle$. We will
call this quantity $D(x - y)$. Each operator $\phi$ is a sum of $a$ and $a^\dagger$ operators,
but only the term $\langle 0|\, a_{\mathbf{p}} a_{\mathbf{q}}^\dagger\, |0\rangle = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})$ survives in this expression.
It is easy to check that we are left with

$$D(x - y) = \langle 0|\,\phi(x)\phi(y)\,|0\rangle = \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\; e^{-ip\cdot(x - y)}. \tag{2.50}$$

We have already argued in (2.40) that integrals of this form are Lorentz
invariant. Let us now evaluate this integral for some particular values of $x - y$.

First consider the case where the difference $x - y$ is purely in the time-direction:
$x^0 - y^0 = t$, $\mathbf{x} - \mathbf{y} = 0$. (If the interval from $y$ to $x$ is timelike, there
is always a frame in which this is the case.) Then we have


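$$\begin{aligned}
D(x - y) &= \frac{4\pi}{(2\pi)^3}\int_0^\infty dp\; \frac{p^2}{2\sqrt{p^2 + m^2}}\; e^{-i\sqrt{p^2 + m^2}\,t} \\
&= \frac{1}{4\pi^2}\int_m^\infty dE\; \sqrt{E^2 - m^2}\; e^{-iEt} \\
&\underset{t\to\infty}{\sim}\; e^{-imt}.
\end{aligned} \tag{2.51}$$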


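Next consider the case where $x - y$ is purely spatial: $x^0 - y^0 = 0$, $\mathbf{x} - \mathbf{y} = \mathbf{r}$.
The amplitude is then

$$\begin{aligned}
D(x - y) &= \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\; e^{i\mathbf{p}\cdot\mathbf{r}} \\
&= \frac{2\pi}{(2\pi)^3}\int_0^\infty dp\; \frac{p^2}{2E_{\mathbf{p}}}\; \frac{e^{ipr} - e^{-ipr}}{ipr} \\
&= \frac{-i}{2(2\pi)^2\, r}\int_{-\infty}^\infty dp\; \frac{p\, e^{ipr}}{\sqrt{p^2 + m^2}}.
\end{aligned}$$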


The integrand, considered as a complex function of $p$, has branch cuts on the
imaginary axis starting at $\pm im$ (see Fig. 2.3). To evaluate the integral we
push the contour up to wrap around the upper branch cut. Defining $\rho = -ip$,
we obtain

$$\frac{1}{4\pi^2 r}\int_m^\infty d\rho\; \frac{\rho\, e^{-\rho r}}{\sqrt{\rho^2 - m^2}} \;\underset{r\to\infty}{\sim}\; e^{-mr}. \tag{2.52}$$

Figure 2.3. Contour for evaluating propagation amplitude $D(x - y)$ over a
spacelike interval.


So again we find that outside the light-cone, the propagation amplitude is 
exponentially vanishing but nonzero. 
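The exponential falloff in (2.52) can be seen numerically. The sketch below is an illustration under stated assumptions: it substitutes $\rho = m\cosh u$, which turns the integral into $\int_0^\infty du\; m\cosh u\; e^{-mr\cosh u}$ (the standard integral representation of $m K_1(mr)$, with $K_1$ a modified Bessel function), and evaluates it with the trapezoid rule. The function name, the cutoff `u_max`, and the step count are choices made for this example.

```python
import math


def spacelike_amplitude(r, mass=1.0, u_max=12.0, steps=20000):
    """D(x - y) at spacelike separation r, from Eq. (2.52) with rho = m cosh(u)."""
    h = u_max / steps
    total = 0.0
    for k in range(steps + 1):
        u = k * h
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid rule end weights
        total += weight * mass * math.cosh(u) * math.exp(-mass * r * math.cosh(u))
    return total * h / (4 * math.pi ** 2 * r)


def large_r_estimate(r, mass=1.0):
    """Leading large-r behaviour of m K_1(mr) / (4 pi^2 r), with the first correction."""
    z = mass * r
    bessel = math.sqrt(math.pi / (2 * z)) * math.exp(-z) * (1 + 3 / (8 * z))
    return mass * bessel / (4 * math.pi ** 2 * r)


for r in (1.0, 5.0, 20.0):
    print(r, spacelike_amplitude(r), large_r_estimate(r))
```

At large $r$ the two columns should approach each other, and both are dominated by the factor $e^{-mr}$: small, but not zero.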

To really discuss causality, however, we should ask not whether particles
can propagate over spacelike intervals, but whether a measurement performed
at one point can affect a measurement at another point whose separation from
the first is spacelike. The simplest thing we could try to measure is the field
$\phi(x)$, so we should compute the commutator $[\phi(x), \phi(y)]$; if this commutator
vanishes, one measurement cannot affect the other. In fact, if the commutator
vanishes for $(x - y)^2 < 0$, causality is preserved quite generally, since
commutators involving any function of $\phi(x)$, including $\pi(x) = \partial\phi/\partial t$, would
also have to vanish. Of course we know from Eq. (2.20) that the commutator
vanishes for $x^0 = y^0$; now let’s do the more general computation:

$$\begin{aligned}
[\phi(x), \phi(y)] &= \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}} \int \frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{q}}}} \\
&\qquad \times \left[\left(a_{\mathbf{p}}\,e^{-ip\cdot x} + a_{\mathbf{p}}^\dagger\,e^{ip\cdot x}\right),\; \left(a_{\mathbf{q}}\,e^{-iq\cdot y} + a_{\mathbf{q}}^\dagger\,e^{iq\cdot y}\right)\right] \\
&= \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\left(e^{-ip\cdot(x - y)} - e^{ip\cdot(x - y)}\right) \\
&= D(x - y) - D(y - x).
\end{aligned} \tag{2.53}$$


When $(x - y)^2 < 0$, we can perform a Lorentz transformation on the second
term (since each term is separately Lorentz invariant), taking $(x - y) \to -(x - y)$,
as shown in Fig. 2.4. The two terms are therefore equal and cancel
to give zero; causality is preserved. Note that if $(x - y)^2 > 0$ there is no
continuous Lorentz transformation that takes $(x - y) \to -(x - y)$. In this case,
by Eq. (2.51), the amplitude is (fortunately) nonzero, roughly $(e^{-imt} - e^{imt})$
for the special case $\mathbf{x} - \mathbf{y} = 0$. Thus we conclude that no measurement in the
Klein-Gordon theory can affect another measurement outside the light-cone.

Figure 2.4. When $x - y$ is spacelike, a continuous Lorentz transformation
can take $(x - y)$ to $-(x - y)$.

Causality is maintained in the Klein-Gordon theory just as suggested at
the end of Section 2.1. To understand this mechanism properly, however, we
should broaden the context of our discussion to include a complex Klein-Gordon
field, which has distinct particle and antiparticle excitations. As was
mentioned in the discussion of Eq. (2.15), we can add a conserved charge to
the Klein-Gordon theory by considering the field $\phi(x)$ to be complex- rather
than real-valued. When the complex scalar field theory is quantized (see
Problem 2.2), $\phi(x)$ will create positively charged particles and destroy negatively
charged ones, while $\phi^\dagger(x)$ will perform the opposite operations. Then the
commutator $[\phi(x), \phi^\dagger(y)]$ will have nonzero contributions, which must delicately
cancel outside the light-cone to preserve causality. The two contributions have
the spacetime interpretation of the two terms in (2.53), but with charges
attached. The first term will represent the propagation of a negatively charged
particle from $y$ to $x$. The second term will represent the propagation of a
positively charged particle from $x$ to $y$. In order for these two processes to
be present and give canceling amplitudes, both of these particles must exist,
and they must have the same mass. In quantum field theory, then, causality
requires that every particle have a corresponding antiparticle with the same
mass and opposite quantum numbers (in this case electric charge). For the
real-valued Klein-Gordon field, the particle is its own antiparticle.

The Klein-Gordon Propagator 

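Let us study the commutator $[\phi(x), \phi(y)]$ a little further. Since it is a
c-number, we can write $[\phi(x), \phi(y)] = \langle 0|\,[\phi(x), \phi(y)]\,|0\rangle$. This can be rewritten
as a four-dimensional integral as follows, assuming for now that $x^0 > y^0$:

$$\begin{aligned}
\langle 0|\,[\phi(x), \phi(y)]\,|0\rangle &= \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\left(e^{-ip\cdot(x - y)} - e^{ip\cdot(x - y)}\right) \\
&= \int \frac{d^3p}{(2\pi)^3}\left\{\frac{1}{2E_{\mathbf{p}}}\,e^{-ip\cdot(x - y)}\bigg|_{p^0 = E_{\mathbf{p}}} + \frac{1}{-2E_{\mathbf{p}}}\,e^{-ip\cdot(x - y)}\bigg|_{p^0 = -E_{\mathbf{p}}}\right\} \\
&\underset{x^0 > y^0}{=} \int \frac{d^3p}{(2\pi)^3}\int \frac{dp^0}{2\pi i}\; \frac{-1}{p^2 - m^2}\; e^{-ip\cdot(x - y)}.
\end{aligned} \tag{2.54}$$

In the last step the $p^0$ integral is to be performed along a contour that runs
along the real axis but passes above both poles, at $p^0 = -E_{\mathbf{p}}$ and $p^0 = +E_{\mathbf{p}}$.

For $x^0 > y^0$ we can close the contour below, picking up both poles to obtain
the previous line of (2.54). For $x^0 < y^0$ we may close the contour above,
giving zero. Thus the last line of (2.54), together with the prescription for
going around the poles, is an expression for what we will call

$$D_R(x - y) \equiv \theta(x^0 - y^0)\, \langle 0|\,[\phi(x), \phi(y)]\,|0\rangle. \tag{2.55}$$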

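To understand this quantity better, let’s do another computation:

$$\begin{aligned}
(\partial^2 + m^2)\,D_R(x - y) &= \left(\partial^2\theta(x^0 - y^0)\right)\langle 0|\,[\phi(x), \phi(y)]\,|0\rangle \\
&\quad + 2\left(\partial_\mu\theta(x^0 - y^0)\right)\left(\partial^\mu\langle 0|\,[\phi(x), \phi(y)]\,|0\rangle\right) \\
&\quad + \theta(x^0 - y^0)\,(\partial^2 + m^2)\,\langle 0|\,[\phi(x), \phi(y)]\,|0\rangle \\
&= -\delta(x^0 - y^0)\,\langle 0|\,[\pi(x), \phi(y)]\,|0\rangle \\
&\quad + 2\,\delta(x^0 - y^0)\,\langle 0|\,[\pi(x), \phi(y)]\,|0\rangle + 0 \\
&= -i\,\delta^{(4)}(x - y).
\end{aligned} \tag{2.56}$$

This says that $D_R(x - y)$ is a Green’s function of the Klein-Gordon operator.
Since it vanishes for $x^0 < y^0$, it is the retarded Green’s function.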

If we had not already derived expression (2.54), we could find it by Fourier
transformation. Writing

$$D_R(x - y) = \int \frac{d^4p}{(2\pi)^4}\; e^{-ip\cdot(x - y)}\; \tilde{D}_R(p), \tag{2.57}$$

we obtain an algebraic expression for $\tilde{D}_R(p)$:

$$(-p^2 + m^2)\,\tilde{D}_R(p) = -i.$$

Thus we immediately arrive at the result

$$D_R(x - y) = \int \frac{d^4p}{(2\pi)^4}\; \frac{i}{p^2 - m^2}\; e^{-ip\cdot(x - y)}. \tag{2.58}$$

The $p^0$-integral of (2.58) can be evaluated according to four different contours,
of which that used in (2.54) is only one. In Chapter 4 we will find that
a different pole prescription, in which the contour passes below the pole at
$p^0 = -E_{\mathbf{p}}$ and above the pole at $p^0 = +E_{\mathbf{p}}$, is extremely useful; it is called
the Feynman prescription. A convenient way to remember it is to write

$$D_F(x - y) \equiv \int \frac{d^4p}{(2\pi)^4}\; \frac{i}{p^2 - m^2 + i\epsilon}\; e^{-ip\cdot(x - y)}, \tag{2.59}$$

since the poles are then at $p^0 = \pm(E_{\mathbf{p}} - i\epsilon)$, displaced properly above and below
the real axis. When $x^0 > y^0$ we can perform the $p^0$ integral by closing the
contour below, obtaining exactly the propagation amplitude $D(x - y)$ (2.50).
When $x^0 < y^0$ we close the contour above, obtaining the same expression but
with $x$ and $y$ interchanged. Thus we have

$$\begin{aligned}
D_F(x - y) &= \begin{cases} D(x - y) & \text{for } x^0 > y^0 \\ D(y - x) & \text{for } x^0 < y^0 \end{cases} \\
&= \theta(x^0 - y^0)\,\langle 0|\,\phi(x)\phi(y)\,|0\rangle + \theta(y^0 - x^0)\,\langle 0|\,\phi(y)\phi(x)\,|0\rangle \\
&\equiv \langle 0|\,T\phi(x)\phi(y)\,|0\rangle.
\end{aligned} \tag{2.60}$$


The last line defines the “time-ordering” symbol $T$, which instructs us to
place the operators that follow in order with the latest to the left. By applying
$(\partial^2 + m^2)$ to the last line, you can verify directly that $D_F$ is a Green’s function
of the Klein-Gordon operator.
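The effect of the $i\epsilon$ prescription on the $p^0$ integral can be explored numerically for a single mode. The sketch below is a simplified illustration, not the full four-dimensional integral: it fixes one energy $E$, keeps $\epsilon$ small but finite, integrates $\int \frac{dp^0}{2\pi}\, \frac{i\,e^{-ip^0 t}}{(p^0)^2 - E^2 + i\epsilon}$ with the trapezoid rule over a finite range, and compares with the residue result $e^{-iw|t|}/(2w)$, $w = \sqrt{E^2 - i\epsilon}$. The cutoff, step size, value of $\epsilon$, and function names are all assumptions of this example.

```python
import cmath
import math


def feynman_mode_integral(t, energy=1.0, eps=0.05, cutoff=400.0, step=0.005):
    """Trapezoid estimate of  int dp0/(2 pi)  i e^{-i p0 t} / (p0^2 - E^2 + i eps)."""
    n = int(round(2 * cutoff / step))
    total = 0.0 + 0.0j
    for k in range(n + 1):
        p0 = -cutoff + k * step
        weight = 0.5 if k in (0, n) else 1.0   # trapezoid rule end weights
        total += weight * 1j * cmath.exp(-1j * p0 * t) / (p0 * p0 - energy ** 2 + 1j * eps)
    return total * step / (2 * math.pi)


def residue_result(t, energy=1.0, eps=0.05):
    """Closed form from closing the contour: e^{-i w |t|} / (2 w), w = sqrt(E^2 - i eps)."""
    w = cmath.sqrt(energy ** 2 - 1j * eps)   # principal root, just below the real axis
    return cmath.exp(-1j * w * abs(t)) / (2 * w)


for t in (1.3, -1.3):
    print(t, feynman_mode_integral(t), residue_result(t))
```

Both signs of $t$ give the same function of $|t|$. As $\epsilon \to 0$ this tends to $e^{-iE|t|}/(2E)$, the single-mode analogue of the statement that $D_F$ equals $D(x - y)$ for $x^0 > y^0$ and $D(y - x)$ for $x^0 < y^0$.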

Equations (2.59) and (2.60) are, from a practical point of view, the most
important results of this chapter. The Green’s function $D_F(x - y)$ is called
the Feynman propagator for a Klein-Gordon particle, since it is, after all, a
propagation amplitude. Indeed, the Feynman propagator will turn out to be
part of the Feynman rules: $D_F(x - y)$ (or $\tilde{D}_F(p)$) is the expression that we will
attach to internal lines of Feynman diagrams, representing the propagation of
virtual particles.

Nevertheless we are still a long way from being able to do any real
calculations, since so far we have talked only about the free Klein-Gordon theory,
where the field equation is linear and there are no interactions. Individual
particles live in their isolated modes, oblivious to each other's existence and to
the existence of any other species of particles. In such a theory there is no hope
of making any observations, by scattering or any other means. On the other
hand, the formalism we have developed is extremely important, since the free
theory forms the basis for doing perturbative calculations in the interacting
theory.

Particle Creation by a Classical Source

There is one type of interaction, however, that we are already equipped to
handle. Consider a Klein-Gordon field coupled to an external, classical source
field $j(x)$. That is, consider the field equation

$$(\partial^2 + m^2)\,\phi(x) = j(x), \tag{2.61}$$

where $j(x)$ is some fixed, known function of space and time that is nonzero
only for a finite time interval. If we start in the vacuum state, what will we
find after $j(x)$ has been turned on and off again?

The field equation (2.61) follows from the Lagrangian

$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m^2\phi^2 + j(x)\,\phi(x). \tag{2.62}$$


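But if $j(x)$ is turned on for only a finite time, it is easiest to solve the problem
using the field equation directly. Before $j(x)$ is turned on, $\phi(x)$ has the form

$$\phi_0(x) = \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\left(a_{\mathbf{p}}\,e^{-ip\cdot x} + a_{\mathbf{p}}^\dagger\,e^{ip\cdot x}\right).$$

If there were no source, this would be the solution for all time. With a source,
the solution of the equation of motion can be constructed using the retarded
Green’s function:

$$\begin{aligned}
\phi(x) &= \phi_0(x) + i\int d^4y\; D_R(x - y)\, j(y) \\
&= \phi_0(x) + i\int d^4y \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\; \theta(x^0 - y^0)\left(e^{-ip\cdot(x - y)} - e^{ip\cdot(x - y)}\right) j(y).
\end{aligned} \tag{2.63}$$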


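If we wait until all of $j$ is in the past, the theta function equals 1 in the whole
domain of integration. Then $\phi(x)$ involves only the Fourier transform of $j$,

$$\tilde{j}(p) = \int d^4y\; e^{ip\cdot y}\, j(y),$$

evaluated at 4-momenta $p$ such that $p^2 = m^2$. It is natural to group the
positive-frequency terms together with $a_{\mathbf{p}}$ and the negative-frequency terms
with $a_{\mathbf{p}}^\dagger$; this yields the expression

$$\phi(x) = \int \frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\left\{\left(a_{\mathbf{p}} + \frac{i}{\sqrt{2E_{\mathbf{p}}}}\,\tilde{j}(p)\right)e^{-ip\cdot x} + \text{h.c.}\right\}. \tag{2.64}$$

You can now guess (or compute) the form of the Hamiltonian after $j(x)$
has acted: Just replace $a_{\mathbf{p}}$ with $\left(a_{\mathbf{p}} + i\tilde{j}(p)/\sqrt{2E_{\mathbf{p}}}\right)$ to obtain

$$H = \int \frac{d^3p}{(2\pi)^3}\; E_{\mathbf{p}}\left(a_{\mathbf{p}}^\dagger - \frac{i}{\sqrt{2E_{\mathbf{p}}}}\,\tilde{j}^*(p)\right)\left(a_{\mathbf{p}} + \frac{i}{\sqrt{2E_{\mathbf{p}}}}\,\tilde{j}(p)\right).$$

The energy of the system after the source has been turned off is

$$\langle 0|\,H\,|0\rangle = \int \frac{d^3p}{(2\pi)^3}\; \frac{1}{2}\,|\tilde{j}(p)|^2, \tag{2.65}$$

where $|0\rangle$ still denotes the ground state of the free theory. We can interpret
these results in terms of particles by identifying $|\tilde{j}(p)|^2/2E_{\mathbf{p}}$ as the probability
density for creating a particle in the mode $p$. Then the total number of particles
produced is

$$\int dN = \int \frac{d^3p}{(2\pi)^3}\; \frac{1}{2E_{\mathbf{p}}}\,|\tilde{j}(p)|^2. \tag{2.66}$$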

Only those Fourier components of $j(x)$ that are in resonance with on-mass-shell
(i.e., $p^2 = m^2$) Klein-Gordon waves are effective at creating particles.
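To get a feeling for the particle-number formula above, one can evaluate it for a specific source. The sketch below is an illustration only and assumes a Gaussian source $j(t, \mathbf{x}) = g\, e^{-t^2/2\tau^2}\, e^{-|\mathbf{x}|^2/2\sigma^2}$, which is not strictly zero outside a finite interval but is negligible at early and late times. Its on-shell Fourier transform is $\tilde{j}(p) = g\,\sqrt{2\pi}\,\tau\, e^{-E_{\mathbf{p}}^2\tau^2/2}\,(2\pi)^{3/2}\sigma^3 e^{-|\mathbf{p}|^2\sigma^2/2}$. The parameter names `g`, `tau`, `sigma` and the integration cutoff are choices made for this example.

```python
import math


def source_ft_squared(p, g, tau, sigma, mass=1.0):
    """|j~(p)|^2 on the mass shell for the assumed Gaussian source."""
    energy = math.sqrt(p * p + mass * mass)
    time_part = math.sqrt(2 * math.pi) * tau * math.exp(-0.5 * (energy * tau) ** 2)
    space_part = (2 * math.pi) ** 1.5 * sigma ** 3 * math.exp(-0.5 * (p * sigma) ** 2)
    return (g * time_part * space_part) ** 2


def expected_particles(g, tau, sigma, mass=1.0, p_max=20.0, steps=20000):
    """N = int d^3p/(2 pi)^3 |j~(p)|^2 / (2 E_p), using spherical symmetry."""
    h = p_max / steps
    total = 0.0
    for k in range(steps + 1):
        p = k * h
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid rule end weights
        energy = math.sqrt(p * p + mass * mass)
        total += weight * p * p * source_ft_squared(p, g, tau, sigma, mass) / (2 * energy)
    return 4 * math.pi * total * h / (2 * math.pi) ** 3


fast = expected_particles(g=0.1, tau=0.5, sigma=1.0)
slow = expected_particles(g=0.1, tau=3.0, sigma=1.0)
print(fast, slow)
```

A source that varies slowly compared with $1/m$ has almost no Fourier weight at $p^0 = E_{\mathbf{p}} \geq m$ and so creates far fewer particles than a rapidly varying one; the number produced also scales as $g^2$.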

We will return to this subject in Problem 4.1. In Chapter 6 we will study
the analogous problem of photon creation by an accelerated electron
(bremsstrahlung).


Problems 


2.1 Classical electromagnetism (with no sources) follows from the action

$$S = \int d^4x \left(-\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}\right), \qquad \text{where } F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu.$$

(a) Derive Maxwell’s equations as the Euler-Lagrange equations of this action,
treating the components $A_\mu(x)$ as the dynamical variables. Write the equations in
standard form by identifying $E^i = -F^{0i}$ and $\epsilon^{ijk}B^k = -F^{ij}$.

(b) Construct the energy-momentum tensor for this theory. Note that the usual
procedure does not result in a symmetric tensor. To remedy that, we can add to
$T^{\mu\nu}$ a term of the form $\partial_\lambda K^{\lambda\mu\nu}$, where $K^{\lambda\mu\nu}$ is antisymmetric in its first two
indices. Such an object is automatically divergenceless, so

$$\hat{T}^{\mu\nu} = T^{\mu\nu} + \partial_\lambda K^{\lambda\mu\nu}$$

is an equally good energy-momentum tensor with the same globally conserved
energy and momentum. Show that this construction, with

$$K^{\lambda\mu\nu} = F^{\mu\lambda}A^\nu,$$

leads to an energy-momentum tensor $\hat{T}$ that is symmetric and yields the standard
formulae for the electromagnetic energy and momentum densities:

$$\mathcal{E} = \tfrac{1}{2}\left(E^2 + B^2\right); \qquad \mathbf{S} = \mathbf{E} \times \mathbf{B}.$$

2.2 The complex scalar field. Consider the field theory of a complex-valued scalar
field obeying the Klein-Gordon equation. The action of this theory is

$$S = \int d^4x \left(\partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi\right).$$

It is easiest to analyze this theory by considering $\phi(x)$ and $\phi^*(x)$, rather than the real
and imaginary parts of $\phi(x)$, as the basic dynamical variables.

(a) Find the conjugate momenta to $\phi(x)$ and $\phi^*(x)$ and the canonical commutation
relations. Show that the Hamiltonian is

$$H = \int d^3x \left(\pi^*\pi + \nabla\phi^*\cdot\nabla\phi + m^2\phi^*\phi\right).$$

Compute the Heisenberg equation of motion for $\phi(x)$ and show that it is indeed
the Klein-Gordon equation.

(b) Diagonalize $H$ by introducing creation and annihilation operators. Show that
the theory contains two sets of particles of mass $m$.

(c) Rewrite the conserved charge

$$Q = \int d^3x\; \frac{i}{2}\left(\phi^*\pi^* - \pi\phi\right)$$

in terms of creation and annihilation operators, and evaluate the charge of the
particles of each type.

(d) Consider the case of two complex Klein-Gordon fields with the same mass. Label
the fields as $\phi_a(x)$, where $a = 1, 2$. Show that there are now four conserved
charges, one given by the generalization of part (c), and the other three given
by

$$Q^i = \int d^3x\; \frac{i}{2}\left(\phi_a^*\,(\sigma^i)_{ab}\,\pi_b^* - \pi_a\,(\sigma^i)_{ab}\,\phi_b\right),$$

where $\sigma^i$ are the Pauli sigma matrices. Show that these three charges have the
commutation relations of angular momentum ($SU(2)$). Generalize these results
to the case of $n$ identical complex scalar fields.

2.3 Evaluate the function

$$\langle 0|\,\phi(x)\phi(y)\,|0\rangle = D(x - y) = \int \frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\; e^{-ip\cdot(x - y)},$$

for $(x - y)$ spacelike so that $(x - y)^2 = -r^2$, explicitly in terms of Bessel functions.


The Dirac Field


Having exhaustively treated the simplest relativistic field equation, we now
move on to the second simplest, the Dirac equation. You may already be
familiar with the Dirac equation in its original incarnation, that is, as a
single-particle quantum-mechanical wave equation.* In this chapter our viewpoint
will be quite different. First we will rederive the Dirac equation as a classical
relativistic field equation, with special emphasis on its relativistic invariance.
Then, in Section 3.5, we will quantize the Dirac field in a manner similar to
that used for the Klein-Gordon field.


3.1 Lorentz Invariance in Wave Equations 

First we must address a question that we swept over in Chapter 2: What do
we mean when we say that an equation is “relativistically invariant”? A
reasonable definition is the following: If $\phi$ is a field or collection of fields and $\mathcal{D}$
is some differential operator, then the statement “$\mathcal{D}\phi = 0$ is relativistically
invariant” means that if $\phi(x)$ satisfies this equation, and we perform a
rotation or boost to a different frame of reference, then the transformed field, in
the new frame of reference, satisfies the same equation. Equivalently, we can
imagine physically rotating or boosting all particles or fields by a common
angle or velocity; again, the equation $\mathcal{D}\phi = 0$ should be true after the
transformation. We will adopt this “active” point of view toward transformations
in the following analysis.

The Lagrangian formulation of field theory makes it especially easy to 
discuss Lorentz invariance. An equation of motion is automatically Lorentz 
invariant by the above definition if it follows from a Lagrangian that is a 
Lorentz scalar. This is an immediate consequence of the principle of least 
action: If boosts leave the Lagrangian unchanged, the boost of an extremum 
in the action will be another extremum. 


*This subject is covered, for example, in Schiff (1968), Chapter 13; Baym (1969),
Chapter 23; Sakurai (1967), Chapter 3. Although the present chapter is self-contained, 
we recommend that you also study the single-particle Dirac equation at some point. 
As an example, consider the Klein-Gordon theory. We can write an
arbitrary Lorentz transformation as

$$x^\mu \to x'^\mu = \Lambda^\mu{}_\nu x^\nu, \tag{3.1}$$

for some $4 \times 4$ matrix $\Lambda$. What happens to the Klein-Gordon field $\phi(x)$ under
this transformation? Think of the field $\phi$ as measuring the local value of some
quantity that is distributed through space. If there is an accumulation of this
quantity at $x = x_0$, $\phi(x)$ will have a maximum at $x_0$. If we now transform the
original distribution by a boost, the new distribution will have a maximum at
$x = \Lambda x_0$. This is illustrated in Fig. 3.1(a). The corresponding transformation
of the field is

$$\phi(x) \to \phi'(x) = \phi(\Lambda^{-1} x). \tag{3.2}$$

That is, the transformed field, evaluated at the boosted point, gives the same 
value as the original field evaluated at the point before boosting. 

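We should check that this transformation leaves the form of the
Klein-Gordon Lagrangian unchanged. According to (3.2), the mass term $\frac{1}{2} m^2 \phi^2(x)$
is simply shifted to the point $(\Lambda^{-1} x)$. The transformation of $\partial_\mu \phi(x)$ is

$$\partial_\mu \phi(x) \to \partial_\mu \left( \phi(\Lambda^{-1} x) \right) = (\Lambda^{-1})^\nu{}_\mu (\partial_\nu \phi)(\Lambda^{-1} x). \tag{3.3}$$

Since the metric tensor $g^{\mu\nu}$ is Lorentz invariant, the matrices $\Lambda^{-1}$ obey the
identity

$$(\Lambda^{-1})^\rho{}_\mu (\Lambda^{-1})^\sigma{}_\nu \, g^{\mu\nu} = g^{\rho\sigma}. \tag{3.4}$$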

Using this relation, we can compute the transformation law of the kinetic term 
of the Klein-Gordon Lagrangian: 

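$$\begin{aligned}
(\partial_\mu \phi(x))^2 &\to g^{\mu\nu} (\partial_\mu \phi'(x)) (\partial_\nu \phi'(x)) \\
&= g^{\mu\nu} \left[ (\Lambda^{-1})^\rho{}_\mu \partial_\rho \phi \right] \left[ (\Lambda^{-1})^\sigma{}_\nu \partial_\sigma \phi \right] (\Lambda^{-1} x) \\
&= g^{\rho\sigma} (\partial_\rho \phi)(\partial_\sigma \phi)(\Lambda^{-1} x) \\
&= (\partial_\mu \phi)^2 (\Lambda^{-1} x).
\end{aligned}$$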

Thus, the whole Lagrangian is simply transformed as a scalar: 

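$$\mathcal{L}(x) \to \mathcal{L}(\Lambda^{-1} x). \tag{3.5}$$

The action $S$, formed by integrating $\mathcal{L}$ over spacetime, is Lorentz invariant.
A similar calculation shows that the equation of motion is invariant:

$$\begin{aligned}
(\partial^2 + m^2) \phi'(x) &= \left[ (\Lambda^{-1})^\nu{}_\mu \partial_\nu (\Lambda^{-1})^{\sigma\mu} \partial_\sigma + m^2 \right] \phi(\Lambda^{-1} x) \\
&= (g^{\nu\sigma} \partial_\nu \partial_\sigma + m^2) \phi(\Lambda^{-1} x) \\
&= 0.
\end{aligned}$$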

The transformation law (3.2) used for $\phi$ is the simplest possible
transformation law for a field. It is the only possibility for a field that has just one
component. But we know examples of multiple-component fields that
transform in more complicated ways. The most familiar case is that of a vector field,
such as the 4-current density $j^\mu(x)$ or the vector potential $A^\mu(x)$. In this case,
the quantity that is distributed in spacetime also carries an orientation, which
must be rotated or boosted. As shown in Fig. 3.1(b), the orientation must be
rotated forward as the point of evaluation of the field is changed:

under 3-dimensional rotations, $V^i(x) \to R^{ij} V^j(R^{-1} x)$;

under Lorentz transformations, $V^\mu(x) \to \Lambda^\mu{}_\nu V^\nu(\Lambda^{-1} x)$.

Figure 3.1. When a rotation is performed on a vector field, it affects the
orientation of the vector as well as the location of the region containing the
configuration.

Tensors of arbitrary rank can be built out of vectors by adding more indices, 
with correspondingly more factors of A in the transformation law. Using such 
vector and tensor fields we can write a variety of Lorentz-invariant equations, 
for example, Maxwell’s equations, 

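$$\partial^\mu F_{\mu\nu} = 0 \quad \text{or} \quad \partial^2 A_\nu - \partial_\nu \partial^\mu A_\mu = 0, \tag{3.6}$$

which follow from the Lagrangian

$$\mathcal{L}_{\text{Maxwell}} = -\frac{1}{4} (F_{\mu\nu})^2 = -\frac{1}{4} (\partial_\mu A_\nu - \partial_\nu A_\mu)^2. \tag{3.7}$$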

In general, any equation in which each term has the same set of uncontracted 
Lorentz indices will naturally be invariant under Lorentz transformations. 

This method of tensor notation yields a large class of Lorentz-invariant 
equations, but it turns out that there are still more. How do we find them? 
We could try to systematically find all possible transformation laws for a field. 
Then it would not be hard to write invariant Lagrangians. For simplicity, we 
will restrict our attention to linear transformations, so that, if $\Phi_a$ is an
$n$-component multiplet, the Lorentz transformation law is given by an $n \times n$
matrix $M(\Lambda)$:

$$\Phi_a(x) \to M_{ab}(\Lambda) \, \Phi_b(\Lambda^{-1} x). \tag{3.8}$$

It can be shown that the most general nonlinear transformation laws can be
built from these linear transformations, so there is no advantage in considering
transformations more general than (3.8). In the following discussion, we will
suppress the change in the field argument and write the transformation (3.8)
in the form

$$\Phi \to M(\Lambda) \Phi. \tag{3.9}$$

What are the possible allowed forms for the matrices $M(\Lambda)$? The basic
restriction on $M(\Lambda)$ is found by imagining two successive transformations, $\Lambda$
and $\Lambda'$. The net result must be a new Lorentz transformation $\Lambda''$; that is,
the Lorentz transformations form a group. This gives a consistency condition
that must be satisfied by the matrices $M(\Lambda)$: Under the sequence of two
transformations,

$$\Phi \to M(\Lambda') M(\Lambda) \Phi = M(\Lambda'') \Phi, \tag{3.10}$$

for $\Lambda'' = \Lambda' \Lambda$. Thus the correspondence between the matrices $M$ and the
transformations $\Lambda$ must be preserved under multiplication. In mathematical
language, we say that the matrices $M$ must form an $n$-dimensional
representation of the Lorentz group. So our question now is rephrased in
mathematical language: What are the (finite-dimensional) matrix representations of the
Lorentz group?

Before answering this question for the Lorentz group, let us consider a
simpler group, the rotation group in three dimensions. This group has
representations of every dimensionality $n$, familiar in quantum mechanics as the matrices
that rotate the $n$-component wavefunctions of particles of different spins. The
dimensionality is related to the spin quantum number $s$ by $n = 2s + 1$. The
most important nontrivial representation is the two-dimensional
representation, corresponding to spin 1/2. The matrices of this representation are the
$2 \times 2$ unitary matrices with determinant 1, which can be expressed as

$$U = e^{-i \theta^i \sigma^i / 2}, \tag{3.11}$$

where $\theta^i$ are three arbitrary parameters and $\sigma^i$ are the Pauli sigma matrices.

For any continuous group, the transformations that lie infinitesimally close
to the identity define a vector space, called the Lie algebra of the group.
The basis vectors for this vector space are called the generators of the Lie
algebra, or of the group. For the rotation group, the generators are the angular
momentum operators $J^i$, which satisfy the commutation relations

$$[J^i, J^j] = i \epsilon^{ijk} J^k. \tag{3.12}$$

The finite rotation operations are formed by exponentiating these operators:
In quantum mechanics, the operator

$$R = \exp\left[ -i \theta^i J^i \right] \tag{3.13}$$

gives the rotation by an angle $|\theta|$ about the axis $\hat{\theta}$. The commutation
relations of the operators $J^i$ determine the multiplication laws of these rotation
operators. Thus, a set of matrices satisfying the commutation relations (3.12)
produces, through exponentiation as in (3.13), a representation of the rotation
group. In the example given in the previous paragraph, the representation of
the angular momentum operators

$$J^i \to \frac{\sigma^i}{2} \tag{3.14}$$

produces the representation of the rotation group given in Eq. (3.11). It is
generally true that one can find matrix representations of a continuous group
by finding matrix representations of the generators of the group (which must
satisfy the proper commutation relations), then exponentiating these
infinitesimal transformations.

For our present problem, we need to know the commutation relations
of the generators of the group of Lorentz transformations. For the rotation
group, one can work out the commutation relations by writing the generators
as differential operators; from the expression

$$\mathbf{J} = \mathbf{x} \times \mathbf{p} = \mathbf{x} \times (-i\nabla), \tag{3.15}$$

the angular momentum commutation relations (3.12) follow straightforwardly.
The use of the cross product in (3.15) is special to the case of three dimensions.
However, we can also write the operators as an antisymmetric tensor,

$$J^{ij} = -i (x^i \nabla^j - x^j \nabla^i),$$

so that $J^3 = J^{12}$ and so on. The generalization to four-dimensional Lorentz
transformations is now quite natural:

$$J^{\mu\nu} = i (x^\mu \partial^\nu - x^\nu \partial^\mu). \tag{3.16}$$

We will soon see that these six operators generate the three boosts and three
rotations of the Lorentz group.

To determine the commutation rules of the Lorentz algebra, we can now
simply compute the commutators of the differential operators (3.16). The
result is

$$[J^{\mu\nu}, J^{\rho\sigma}] = i \left( g^{\nu\rho} J^{\mu\sigma} - g^{\mu\rho} J^{\nu\sigma} - g^{\nu\sigma} J^{\mu\rho} + g^{\mu\sigma} J^{\nu\rho} \right). \tag{3.17}$$

Any matrices that are to represent this algebra must obey these same
commutation rules.

Just to see that we have this right, let us look at one particular
representation (which we will simply pull out of a hat). Consider the $4 \times 4$ matrices

$$(J^{\mu\nu})_{\alpha\beta} = i \left( \delta^\mu{}_\alpha \delta^\nu{}_\beta - \delta^\mu{}_\beta \delta^\nu{}_\alpha \right). \tag{3.18}$$

(Here $\mu$ and $\nu$ label which of the six matrices we want, while $\alpha$ and $\beta$
label components of the matrices.) You can easily verify that these matrices
satisfy the commutation relations (3.17). In fact, they are nothing but the
matrices that act on ordinary Lorentz 4-vectors. To see this, parametrize an
infinitesimal transformation as follows:

$$V^\alpha \to \left( \delta^\alpha{}_\beta - \frac{i}{2} \omega_{\mu\nu} (J^{\mu\nu})^\alpha{}_\beta \right) V^\beta, \tag{3.19}$$

where $V$ is a 4-vector and $\omega_{\mu\nu}$, an antisymmetric tensor, gives the
infinitesimal angles. For example, consider the case $\omega_{12} = -\omega_{21} = \theta$, with all other
components of $\omega$ equal to zero. Then Eq. (3.19) becomes

$$V \to \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -\theta & 0 \\ 0 & \theta & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} V, \tag{3.20}$$

which is just an infinitesimal rotation in the $xy$-plane. You can also verify
that setting $\omega_{01} = -\omega_{10} = \beta$ gives

$$V \to \begin{pmatrix} 1 & \beta & 0 & 0 \\ \beta & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} V, \tag{3.21}$$

an infinitesimal boost in the $x$-direction. The other components of $\omega$ generate
the remaining boosts and rotations in a similar manner.
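The claims around Eqs. (3.17)-(3.21) can be checked numerically. The following is a minimal sketch, assuming NumPy and the metric $g = \mathrm{diag}(1,-1,-1,-1)$; the helper names `J`, `algebra_residual`, and `infinitesimal` are illustrative choices, not notation from the text.

```python
import itertools
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # metric g^{mu nu} = diag(+,-,-,-)

def J(mu, nu):
    """(J^{mu nu})^alpha_beta: Eq. (3.18) with the first matrix index raised by g."""
    M = np.zeros((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            M[a, b] = 1j * (g[mu, a] * (nu == b) - (mu == b) * g[nu, a])
    return M

def algebra_residual(gen):
    """Largest violation of the commutation relations (3.17) over all index choices."""
    worst = 0.0
    for mu, nu, rho, sig in itertools.product(range(4), repeat=4):
        lhs = gen(mu, nu) @ gen(rho, sig) - gen(rho, sig) @ gen(mu, nu)
        rhs = 1j * (g[nu, rho] * gen(mu, sig) - g[mu, rho] * gen(nu, sig)
                    - g[nu, sig] * gen(mu, rho) + g[mu, sig] * gen(nu, rho))
        worst = max(worst, float(np.abs(lhs - rhs).max()))
    return worst

def infinitesimal(omega):
    """delta^alpha_beta - (i/2) omega_{mu nu} (J^{mu nu})^alpha_beta, as in Eq. (3.19)."""
    M = np.eye(4, dtype=complex)
    for mu, nu in itertools.product(range(4), repeat=2):
        M = M - 0.5j * omega[mu, nu] * J(mu, nu)
    return M.real  # the factors of i cancel, so the matrix is real

theta, beta = 1e-3, 2e-3
omega_rot = np.zeros((4, 4))
omega_rot[1, 2], omega_rot[2, 1] = theta, -theta      # omega_12 = -omega_21 = theta
omega_boost = np.zeros((4, 4))
omega_boost[0, 1], omega_boost[1, 0] = beta, -beta    # omega_01 = -omega_10 = beta

residual = algebra_residual(J)
rotation = infinitesimal(omega_rot)
boost = infinitesimal(omega_boost)
print("largest violation of (3.17):", residual)
print(rotation)
print(boost)
```

The two printed matrices should reproduce the patterns of Eqs. (3.20) and (3.21): an antisymmetric pair of entries for the rotation and a symmetric pair for the boost.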


3.2 The Dirac Equation 

Now that we have seen one finite-dimensional representation of the Lorentz 
group, the logical next step would be to develop the formalism for finding 
all other representations. Although this is not very difficult to do (see
Problem 3.1), it is hardly necessary for our purposes, since we are mainly interested
in the representation(s) corresponding to spin 1/2. 

We can find such a representation using a trick due to Dirac: Suppose
that we had a set of four $n \times n$ matrices $\gamma^\mu$ satisfying the anticommutation
relations

$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 g^{\mu\nu} \times \mathbf{1}_{n \times n} \quad \text{(Dirac algebra)}. \tag{3.22}$$

Then we could immediately write down an $n$-dimensional representation of
the Lorentz algebra. Here it is:

$$S^{\mu\nu} = \frac{i}{4} [\gamma^\mu, \gamma^\nu]. \tag{3.23}$$

By repeated use of (3.22), it is easy to verify that these matrices satisfy the
commutation relations (3.17).

This computation goes through in any dimensionality, with Lorentz or
Euclidean metric. In particular, it should work in three-dimensional Euclidean
space, and in fact we can simply write

$$\gamma^j \equiv i\sigma^j \quad \text{(Pauli sigma matrices)},$$

so that

$$\{\gamma^i, \gamma^j\} = -2\delta^{ij}.$$

The factor of $i$ in the first line and the minus sign in the second line are purely
conventional. The matrices representing the Lorentz algebra are then

$$S^{ij} = \frac{1}{2} \epsilon^{ijk} \sigma^k, \tag{3.24}$$

which we recognize as the two-dimensional representation of the rotation
group.

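Now let us find Dirac matrices for four-dimensional Minkowski space.
It turns out that these matrices must be at least $4 \times 4$. (There is no fourth
$2 \times 2$ matrix, for example, that anticommutes with the three Pauli sigma
matrices.) Further, all $4 \times 4$ representations of the Dirac algebra are unitarily
equivalent.† We thus need only write one explicit realization of the Dirac
algebra. One representation, in $2 \times 2$ block form, is

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad \gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}. \tag{3.25}$$

This representation is called the Weyl or chiral representation. We will find
it an especially convenient choice, and we will use it exclusively throughout
this book. (Be careful, however, since many field theory textbooks choose a
different representation, in which $\gamma^0$ is diagonal. Furthermore, books that use
chiral representations often make a different choice of sign conventions.)

In our representation, the boost and rotation generators are

$$S^{0i} = \frac{i}{4} [\gamma^0, \gamma^i] = -\frac{i}{2} \begin{pmatrix} \sigma^i & 0 \\ 0 & -\sigma^i \end{pmatrix}, \tag{3.26}$$

and

$$S^{ij} = \frac{i}{4} [\gamma^i, \gamma^j] = \frac{1}{2} \epsilon^{ijk} \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} \equiv \frac{1}{2} \epsilon^{ijk} \Sigma^k. \tag{3.27}$$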


A four-component field $\psi$ that transforms under boosts and rotations
according to (3.26) and (3.27) is called a Dirac spinor. Note that the rotation
generator $S^{ij}$ is just the three-dimensional spinor transformation matrix (3.24)
replicated twice. The boost generators $S^{0i}$ are not Hermitian, and thus our
implementation of boosts is not unitary (this was also true of the vector
representation (3.18)). In fact the Lorentz group, being “noncompact”, has no
faithful, finite-dimensional representations that are unitary. But that does not
matter to us, since $\psi$ is not a wavefunction; it is a classical field.
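The statements about the chiral representation can be confirmed by direct matrix arithmetic. This is a sketch only, assuming NumPy and the standard Weyl-representation block forms $\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$; the variable names are illustrative.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Chiral (Weyl) representation, assumed block form of Eq. (3.25).
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = np.diag([1.0, -1.0, -1.0, -1.0])

def S(mu, nu):
    """Spinor generator S^{mu nu} = (i/4)[gamma^mu, gamma^nu], Eq. (3.23)."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

# Dirac algebra (3.22): {gamma^mu, gamma^nu} = 2 g^{mu nu}.
dirac_algebra_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * g[m, n] * np.eye(4))
    for m, n in itertools.product(range(4), repeat=2)
)

# Lorentz algebra (3.17) for the spinor generators.
lorentz_residual = 0.0
for mu, nu, rho, sig in itertools.product(range(4), repeat=4):
    lhs = S(mu, nu) @ S(rho, sig) - S(rho, sig) @ S(mu, nu)
    rhs = 1j * (g[nu, rho] * S(mu, sig) - g[mu, rho] * S(nu, sig)
                - g[nu, sig] * S(mu, rho) + g[mu, sig] * S(nu, rho))
    lorentz_residual = max(lorentz_residual, float(np.abs(lhs - rhs).max()))

# Boost generators are anti-Hermitian, rotation generators Hermitian.
boosts_antihermitian = all(np.allclose(S(0, i).conj().T, -S(0, i)) for i in (1, 2, 3))
rotations_hermitian = all(
    np.allclose(S(i, j).conj().T, S(i, j)) for i in (1, 2, 3) for j in (1, 2, 3)
)

# Explicit block forms of (3.26) and (3.27) for one boost and one rotation.
S03_expected = -0.5j * np.block([[sigma[2], Z2], [Z2, -sigma[2]]])
S12_expected = 0.5 * np.block([[sigma[2], Z2], [Z2, sigma[2]]])

print(dirac_algebra_ok, lorentz_residual, boosts_antihermitian, rotations_hermitian)
```

Because the boost generators are anti-Hermitian rather than Hermitian, exponentiating them with the factor $-i$ gives a Hermitian, non-unitary matrix, which is the non-unitarity noted above.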


†This statement and the preceding one follow from the general theory of the
representations of the Lorentz group derived in Problem 3.1.
Now that we have the transformation law for $\psi$, we should look for an
appropriate field equation. One possibility is simply the Klein-Gordon
equation:

$$(\partial^2 + m^2) \psi = 0. \tag{3.28}$$

This works because the spinor transformation matrices (3.26) and (3.27)
operate only in the “internal” space; they go right through the differential
operator. But it is possible to write a stronger, first-order equation, which implies
(3.28) but contains additional information. To do this we need to know one
more property of the $\gamma$ matrices. With a short computation you can verify
that

$$[\gamma^\mu, S^{\rho\sigma}] = (J^{\rho\sigma})^\mu{}_\nu \gamma^\nu,$$

or equivalently,

$$\left( 1 + \frac{i}{2} \omega_{\rho\sigma} S^{\rho\sigma} \right) \gamma^\mu \left( 1 - \frac{i}{2} \omega_{\rho\sigma} S^{\rho\sigma} \right) = \left( 1 - \frac{i}{2} \omega_{\rho\sigma} J^{\rho\sigma} \right)^\mu{}_\nu \gamma^\nu.$$

This equation is just the infinitesimal form of

$$\Lambda_{1/2}^{-1} \gamma^\mu \Lambda_{1/2} = \Lambda^\mu{}_\nu \gamma^\nu, \tag{3.29}$$

where

$$\Lambda_{1/2} = \exp\left( -\frac{i}{2} \omega_{\mu\nu} S^{\mu\nu} \right) \tag{3.30}$$

is the spinor representation of the Lorentz transformation $\Lambda$ (compare (3.19)).
Equation (3.29) says that the $\gamma$ matrices are invariant under simultaneous
rotations of their vector and spinor indices (just like the $\sigma^i$ under spatial
rotations). In other words, we can “take the vector index $\mu$ on $\gamma^\mu$ seriously,”
and dot $\gamma^\mu$ into $\partial_\mu$ to form a Lorentz-invariant differential operator.
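Equation (3.29) can be tested for a finite transformation. The sketch below assumes NumPy, the chiral-representation $\gamma$ matrices, and a simple Taylor-series matrix exponential (`expm` is a local helper written here, not a library call); the antisymmetric parameters `omega` are arbitrary illustrative numbers.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
g = np.diag([1.0, -1.0, -1.0, -1.0])

def expm(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def J(mu, nu):
    """Vector-representation generator (J^{mu nu})^alpha_beta."""
    M = np.zeros((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            M[a, b] = 1j * (g[mu, a] * (nu == b) - (mu == b) * g[nu, a])
    return M

def S(mu, nu):
    """Spinor-representation generator (i/4)[gamma^mu, gamma^nu]."""
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
omega = 0.3 * (A - A.T)  # antisymmetric omega_{mu nu}: a generic mix of boosts and rotations

idx = [(m, n) for m in range(4) for n in range(4)]
Lam = expm(sum(-0.5j * omega[m, n] * J(m, n) for m, n in idx)).real   # Lambda^mu_nu
Lam_half = expm(sum(-0.5j * omega[m, n] * S(m, n) for m, n in idx))   # Lambda_{1/2}
Lam_half_inv = np.linalg.inv(Lam_half)

# Compare both sides of (3.29) for each mu.
max_err = max(
    float(np.abs(Lam_half_inv @ gamma[mu] @ Lam_half
                 - sum(Lam[mu, nu] * gamma[nu] for nu in range(4))).max())
    for mu in range(4)
)
print("largest deviation from (3.29):", max_err)
```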

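We are now ready to write down the Dirac equation. Here it is:

$$(i \gamma^\mu \partial_\mu - m) \psi(x) = 0. \tag{3.31}$$

To show that it is Lorentz invariant, write down the Lorentz-transformed
version of the left-hand side and calculate:

$$\begin{aligned}
\left[ i \gamma^\mu \partial_\mu - m \right] \psi(x) &\to \left[ i \gamma^\mu (\Lambda^{-1})^\nu{}_\mu \partial_\nu - m \right] \Lambda_{1/2} \psi(\Lambda^{-1} x) \\
&= \Lambda_{1/2} \Lambda_{1/2}^{-1} \left[ i \gamma^\mu (\Lambda^{-1})^\nu{}_\mu \partial_\nu - m \right] \Lambda_{1/2} \psi(\Lambda^{-1} x) \\
&= \Lambda_{1/2} \left[ i \Lambda_{1/2}^{-1} \gamma^\mu \Lambda_{1/2} (\Lambda^{-1})^\nu{}_\mu \partial_\nu - m \right] \psi(\Lambda^{-1} x) \\
&= \Lambda_{1/2} \left[ i \Lambda^\mu{}_\sigma \gamma^\sigma (\Lambda^{-1})^\nu{}_\mu \partial_\nu - m \right] \psi(\Lambda^{-1} x) \\
&= \Lambda_{1/2} \left[ i \gamma^\nu \partial_\nu - m \right] \psi(\Lambda^{-1} x) \\
&= 0.
\end{aligned}$$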
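To see that the Dirac equation implies the Klein-Gordon equation, act on the
left with $(-i \gamma^\mu \partial_\mu - m)$:

$$\begin{aligned}
0 &= (-i \gamma^\mu \partial_\mu - m)(i \gamma^\nu \partial_\nu - m) \psi \\
&= (\gamma^\mu \gamma^\nu \partial_\mu \partial_\nu + m^2) \psi \\
&= (\partial^2 + m^2) \psi.
\end{aligned}$$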
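The last step of this calculation relies on the fact that partial derivatives commute, so only the symmetric part of $\gamma^\mu \gamma^\nu$ contributes. Spelled out with the Dirac algebra (3.22), the intermediate step reads:

```latex
\gamma^\mu \gamma^\nu \partial_\mu \partial_\nu
  = \tfrac{1}{2} \{ \gamma^\mu, \gamma^\nu \} \, \partial_\mu \partial_\nu
  = g^{\mu\nu} \partial_\mu \partial_\nu
  = \partial^2 .
```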

To write down a Lagrangian for the Dirac theory, we must figure out how
to multiply two Dirac spinors to form a Lorentz scalar. The obvious guess,
$\psi^\dagger \psi$, does not work. Under a Lorentz boost this becomes $\psi^\dagger \Lambda_{1/2}^\dagger \Lambda_{1/2} \psi$; if the
boost matrix were unitary, we would have $\Lambda_{1/2}^\dagger = \Lambda_{1/2}^{-1}$ and everything would be
fine. But $\Lambda_{1/2}$ is not unitary, because the generators (3.26) are not Hermitian.
The solution is to define

$$\bar{\psi} \equiv \psi^\dagger \gamma^0. \tag{3.32}$$

Under an infinitesimal Lorentz transformation parametrized by $\omega_{\mu\nu}$, we have
$\bar{\psi} \to \psi^\dagger \left( 1 + \frac{i}{2} \omega_{\mu\nu} (S^{\mu\nu})^\dagger \right) \gamma^0$. The sum over $\mu$ and $\nu$ has six distinct nonzero
terms. In the rotation terms, where $\mu$ and $\nu$ are both nonzero, $(S^{\mu\nu})^\dagger = S^{\mu\nu}$
and $S^{\mu\nu}$ commutes with $\gamma^0$. In the boost terms, where $\mu$ or $\nu$ is 0, $(S^{\mu\nu})^\dagger = -(S^{\mu\nu})$
but $S^{\mu\nu}$ anticommutes with $\gamma^0$. Passing the $\gamma^0$ to the left therefore
removes the dagger from $S^{\mu\nu}$, yielding the transformation law

$$\bar{\psi} \to \bar{\psi} \Lambda_{1/2}^{-1}, \tag{3.33}$$

and therefore the quantity $\bar{\psi} \psi$ is a Lorentz scalar. Similarly you can show
(with the aid of (3.29)) that $\bar{\psi} \gamma^\mu \psi$ is a Lorentz vector.
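The difference between $\psi^\dagger \psi$ and $\bar{\psi} \psi$ under a boost is easy to see numerically. The following sketch assumes NumPy, the chiral-representation $\gamma$ matrices, a boost along the 3-direction with rapidity `eta`, and an arbitrary sample spinor `psi`; all of these are illustrative choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def expm(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def S(mu, nu):
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

def bar(chi):
    """Dirac adjoint chi-bar = chi^dagger gamma^0, Eq. (3.32)."""
    return chi.conj() @ gamma[0]

eta = 0.8
# omega_03 = -omega_30 = eta, so -(i/2) omega_{mu nu} S^{mu nu} = -i eta S^{03}.
Lam_half = expm(-1j * eta * S(0, 3))
Lam_half_inv = np.linalg.inv(Lam_half)

psi = np.array([1.0, 2.0j, -0.5, 0.3 + 0.1j])  # arbitrary sample spinor
psi_boosted = Lam_half @ psi

scalar_before = bar(psi) @ psi
scalar_after = bar(psi_boosted) @ psi_boosted
norm_before = np.vdot(psi, psi).real
norm_after = np.vdot(psi_boosted, psi_boosted).real

print("psi-bar psi:    ", scalar_before, "->", scalar_after)
print("psi-dagger psi: ", norm_before, "->", norm_after)
```

The first pair of numbers should agree, while the second pair should not, reflecting that $\Lambda_{1/2}^\dagger \gamma^0 = \gamma^0 \Lambda_{1/2}^{-1}$ even though $\Lambda_{1/2}$ is not unitary.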

The correct, Lorentz-invariant Dirac Lagrangian is therefore

$$\mathcal{L}_{\text{Dirac}} = \bar{\psi} (i \gamma^\mu \partial_\mu - m) \psi. \tag{3.34}$$

The Euler-Lagrange equation for $\bar{\psi}$ (or $\psi^\dagger$) immediately yields the Dirac
equation in the form (3.31); the Euler-Lagrange equation for $\psi$ gives the same
equation, in Hermitian-conjugate form:

$$-i \partial_\mu \bar{\psi} \gamma^\mu - m \bar{\psi} = 0. \tag{3.35}$$


Weyl Spinors 


From the block-diagonal form of the generators (3.26) and (3.27), it is
apparent that the Dirac representation of the Lorentz group is reducible.† We can
form two 2-dimensional representations by considering each block separately,
and writing

$$\psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}. \tag{3.36}$$

†If we had used a different representation of the gamma matrices, the reducibility
would not be manifest; this is essentially the reason for using the chiral representation.
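The two-component objects $\psi_L$ and $\psi_R$ are called left-handed and
right-handed Weyl spinors. You can easily verify that their transformation laws,
under infinitesimal rotations $\boldsymbol{\theta}$ and boosts $\boldsymbol{\beta}$, are

$$\begin{aligned}
\psi_L &\to \left( 1 - i \boldsymbol{\theta} \cdot \frac{\boldsymbol{\sigma}}{2} - \boldsymbol{\beta} \cdot \frac{\boldsymbol{\sigma}}{2} \right) \psi_L; \\
\psi_R &\to \left( 1 - i \boldsymbol{\theta} \cdot \frac{\boldsymbol{\sigma}}{2} + \boldsymbol{\beta} \cdot \frac{\boldsymbol{\sigma}}{2} \right) \psi_R.
\end{aligned} \tag{3.37}$$

These transformation laws are connected by complex conjugation; using the
identity

$$\sigma^2 \boldsymbol{\sigma}^* = -\boldsymbol{\sigma} \sigma^2, \tag{3.38}$$

it is not hard to show that the quantity $\sigma^2 \psi_L^*$ transforms like a right-handed
spinor.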
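The statement that $\sigma^2 \psi_L^*$ transforms like a right-handed spinor can be checked for finite transformations by exponentiating the infinitesimal laws. This is a sketch under stated assumptions: NumPy, a Taylor-series `expm` helper defined locally, and arbitrary sample values for the rotation and boost parameters.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
sigma2 = sigma[1]  # the second Pauli matrix, sigma^2 in the text

def expm(A, terms=40):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

# Identity (3.38), component by component.
identity_ok = all(np.allclose(sigma2 @ s.conj(), -s @ sigma2) for s in sigma)

theta = np.array([0.3, -0.5, 0.2])   # sample rotation parameters
beta = np.array([0.4, 0.1, -0.6])    # sample boost parameters
theta_sigma = sum(t * s for t, s in zip(theta, sigma))
beta_sigma = sum(b * s for b, s in zip(beta, sigma))

# Finite versions of the left- and right-handed transformation laws.
M_L = expm(-0.5j * theta_sigma - 0.5 * beta_sigma)
M_R = expm(-0.5j * theta_sigma + 0.5 * beta_sigma)

psi_L = np.array([0.7 - 0.2j, -0.1 + 0.9j])       # arbitrary left-handed spinor
chi = sigma2 @ psi_L.conj()                        # candidate right-handed spinor
chi_from_transformed = sigma2 @ (M_L @ psi_L).conj()

print(identity_ok, np.allclose(chi_from_transformed, M_R @ chi))
```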

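In terms of $\psi_L$ and $\psi_R$, the Dirac equation is

$$(i \gamma^\mu \partial_\mu - m) \psi = \begin{pmatrix} -m & i(\partial_0 + \boldsymbol{\sigma} \cdot \nabla) \\ i(\partial_0 - \boldsymbol{\sigma} \cdot \nabla) & -m \end{pmatrix} \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = 0. \tag{3.39}$$

The two Lorentz group representations $\psi_L$ and $\psi_R$ are mixed by the mass
term in the Dirac equation. But if we set $m = 0$, the equations for $\psi_L$ and $\psi_R$
decouple:

$$\begin{aligned}
i(\partial_0 - \boldsymbol{\sigma} \cdot \nabla) \psi_L &= 0; \\
i(\partial_0 + \boldsymbol{\sigma} \cdot \nabla) \psi_R &= 0.
\end{aligned} \tag{3.40}$$

These are called the Weyl equations; they are especially important when
treating neutrinos and the theory of weak interactions.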

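It is possible to clean up this notation slightly. Define

$$\sigma^\mu \equiv (1, \boldsymbol{\sigma}), \qquad \bar{\sigma}^\mu \equiv (1, -\boldsymbol{\sigma}), \tag{3.41}$$

so that

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar{\sigma}^\mu & 0 \end{pmatrix}. \tag{3.42}$$

(The bar on $\sigma$ has absolutely nothing to do with the bar on $\psi$.) Then the
Dirac equation can be written

$$\begin{pmatrix} -m & i \sigma \cdot \partial \\ i \bar{\sigma} \cdot \partial & -m \end{pmatrix} \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = 0, \tag{3.43}$$

and the Weyl equations become

$$i \bar{\sigma} \cdot \partial \psi_L = 0; \qquad i \sigma \cdot \partial \psi_R = 0. \tag{3.44}$$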

3.3 Free-Particle Solutions of the Dirac Equation

To get some feel for the physics of the Dirac equation, let us now discuss its
plane-wave solutions. Since a Dirac field $\psi$ obeys the Klein-Gordon equation,
we know immediately that it can be written as a linear combination of plane
waves:

$$\psi(x) = u(p) e^{-ip \cdot x}, \quad \text{where } p^2 = m^2. \tag{3.45}$$

For the moment we will concentrate on solutions with positive frequency, that
is, $p^0 > 0$. The column vector $u(p)$ must obey an additional constraint, found
by plugging (3.45) into the Dirac equation:

$$(\gamma^\mu p_\mu - m) u(p) = 0. \tag{3.46}$$

It is easiest to analyze this equation in the rest frame, where $p = p_0 = (m, \mathbf{0})$;
the solution for general $p$ can then be found by boosting with $\Lambda_{1/2}$. In the rest
frame, Eq. (3.46) becomes

$$(m \gamma^0 - m) u(p_0) = m \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} u(p_0) = 0,$$

and the solutions are

$$u(p_0) = \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix}, \tag{3.47}$$

for any numerical two-component spinor $\xi$. We conventionally normalize $\xi$ so
that $\xi^\dagger \xi = 1$; the factor $\sqrt{m}$ has been inserted for future convenience. We can
interpret the spinor $\xi$ by looking at the rotation generator (3.27): $\xi$ transforms
under rotations as an ordinary two-component spinor of the rotation group,
and therefore determines the spin orientation of the Dirac solution in the
usual way. For example, when $\xi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, the particle has spin up along the
3-direction.

Notice that after applying the Dirac equation, we are free to choose only 
two of the four components of u(p). This is just what we want, since a spin-1/2 
particle has only two physical states—spin up and spin down. (Of course we 
are being a bit premature in talking about particles and spin. We will prove 
that the spin angular momentum of a Dirac particle is $\hbar/2$ when we quantize
the Dirac theory in Section 3.5; for now, just notice that there are two possible 
solutions u(p) for any momentum p.) 

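Now that we have the general form of $u(p)$ in the rest frame, we can obtain
$u(p)$ in any other frame by boosting. Consider a boost along the 3-direction.
First we should remind ourselves of what the boost does to the 4-momentum
vector. In infinitesimal form,

$$\begin{pmatrix} E \\ p^3 \end{pmatrix} = \left[ 1 + \eta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right] \begin{pmatrix} m \\ 0 \end{pmatrix},$$

where $\eta$ is some infinitesimal parameter. For finite $\eta$ we must write

$$\begin{aligned}
\begin{pmatrix} E \\ p^3 \end{pmatrix} &= \exp\left[ \eta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right] \begin{pmatrix} m \\ 0 \end{pmatrix} \\
&= \left[ \cosh\eta \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \sinh\eta \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right] \begin{pmatrix} m \\ 0 \end{pmatrix} \\
&= \begin{pmatrix} m \cosh\eta \\ m \sinh\eta \end{pmatrix}.
\end{aligned} \tag{3.48}$$

The parameter $\eta$ is called the rapidity. It is the quantity that is additive under
successive boosts.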

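Now apply the same boost to $u(p)$. According to Eqs. (3.26) and (3.30),

$$\begin{aligned}
u(p) &= \exp\left[ -\frac{1}{2} \eta \begin{pmatrix} \sigma^3 & 0 \\ 0 & -\sigma^3 \end{pmatrix} \right] \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} \\
&= \left[ \cosh(\tfrac{1}{2}\eta) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - \sinh(\tfrac{1}{2}\eta) \begin{pmatrix} \sigma^3 & 0 \\ 0 & -\sigma^3 \end{pmatrix} \right] \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} \\
&= \begin{pmatrix} e^{\eta/2} \left( \frac{1 - \sigma^3}{2} \right) + e^{-\eta/2} \left( \frac{1 + \sigma^3}{2} \right) & 0 \\ 0 & e^{\eta/2} \left( \frac{1 + \sigma^3}{2} \right) + e^{-\eta/2} \left( \frac{1 - \sigma^3}{2} \right) \end{pmatrix} \sqrt{m} \begin{pmatrix} \xi \\ \xi \end{pmatrix} \\
&= \begin{pmatrix} \left[ \sqrt{E + p^3} \left( \frac{1 - \sigma^3}{2} \right) + \sqrt{E - p^3} \left( \frac{1 + \sigma^3}{2} \right) \right] \xi \\ \left[ \sqrt{E + p^3} \left( \frac{1 + \sigma^3}{2} \right) + \sqrt{E - p^3} \left( \frac{1 - \sigma^3}{2} \right) \right] \xi \end{pmatrix}.
\end{aligned} \tag{3.49}$$

The last line can be simplified to give

$$u(p) = \begin{pmatrix} \sqrt{p \cdot \sigma} \, \xi \\ \sqrt{p \cdot \bar{\sigma}} \, \xi \end{pmatrix}, \tag{3.50}$$

where it is understood that in taking the square root of a matrix, we take
the positive root of each eigenvalue. This expression for $u(p)$ is not only more
compact, but is also valid for an arbitrary direction of $\mathbf{p}$. When working with
expressions of this form, it is often useful to know the identity

$$(p \cdot \sigma)(p \cdot \bar{\sigma}) = p^2 = m^2. \tag{3.51}$$

You can then verify directly that (3.50) is a solution of the Dirac equation in
the form of (3.43).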
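The direct verification suggested here can also be done numerically. The sketch below assumes NumPy, the chiral-representation $\gamma$ matrices, and sample values for the mass `m`, the 3-momentum `p`, and the spinor `xi`; the helper `msqrt` (positive square root of a Hermitian matrix via its eigen-decomposition) is an illustrative name.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def msqrt(H):
    """Positive square root of a positive Hermitian matrix (root of each eigenvalue)."""
    w, V = np.linalg.eigh(H)
    return (V * np.sqrt(w)) @ V.conj().T

m = 1.0
p = np.array([0.3, -0.4, 1.2])          # sample 3-momentum
E = np.sqrt(m**2 + p @ p)
p_vec_sigma = sum(pi * s for pi, s in zip(p, sigma))

p_sigma = E * I2 - p_vec_sigma           # p . sigma     = E - p_vec . sigma
p_sigmabar = E * I2 + p_vec_sigma        # p . sigma-bar = E + p_vec . sigma

xi = np.array([0.6, 0.8j])               # normalized two-component spinor
u = np.concatenate([msqrt(p_sigma) @ xi, msqrt(p_sigmabar) @ xi])  # Eq. (3.50)

# gamma^mu p_mu = E gamma^0 - p_vec . gamma_vec
pslash = E * gamma[0] - sum(pi * gi for pi, gi in zip(p, gamma[1:]))
residual = (pslash - m * np.eye(4)) @ u  # should vanish by Eq. (3.46)

print("identity (3.51) holds:", np.allclose(p_sigma @ p_sigmabar, m**2 * I2))
print("largest component of (pslash - m) u:", np.abs(residual).max())
```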

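In practice it is often convenient to work with specific spinors $\xi$. A useful
choice here would be eigenstates of $\sigma^3$. For example, if $\xi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ (spin up along
the 3-axis), we get

$$u(p) = \begin{pmatrix} \sqrt{E - p^3} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \sqrt{E + p^3} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} \xrightarrow{\text{large boost}} \sqrt{2E} \begin{pmatrix} 0 \\ \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix}, \tag{3.52}$$

while for $\xi = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ (spin down along the 3-axis) we have

$$u(p) = \begin{pmatrix} \sqrt{E + p^3} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\ \sqrt{E - p^3} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \end{pmatrix} \xrightarrow{\text{large boost}} \sqrt{2E} \begin{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\ 0 \end{pmatrix}. \tag{3.53}$$

In the limit $\eta \to \infty$ the states degenerate into the two-component spinors of
a massless particle. (We now see the reason for the factor of $\sqrt{m}$ in (3.47): It
keeps the spinor expressions finite in the massless limit.)

The solutions (3.52) and (3.53) are eigenstates of the helicity operator,

$$h \equiv \hat{p} \cdot \mathbf{S} = \frac{1}{2} \hat{p}_i \begin{pmatrix} \sigma^i & 0 \\ 0 & \sigma^i \end{pmatrix}. \tag{3.54}$$

A particle with $h = +1/2$ is called right-handed, while one with $h = -1/2$ is
called left-handed. The helicity of a massive particle depends on the frame of
reference, since one can always boost to a frame in which its momentum is
in the opposite direction (but its spin is unchanged). For a massless particle,
which travels at the speed of light, one cannot perform such a boost.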

The extremely simple form of $u(p)$ for a massless particle in a helicity
eigenstate makes the behavior of such a particle easy to understand. In
Chapter 1, it enabled us to guess the form of the $e^+ e^- \to \mu^+ \mu^-$ cross section in the
massless limit. In subsequent chapters we will often do a mindless calculation
first, then look at helicity eigenstates in the high-energy limit to understand
what we have done.

Incidentally, we are now ready to understand the origin of the notation
$\psi_L$ and $\psi_R$ for Weyl spinors. The solutions of the Weyl equations are states of
definite helicity, corresponding to left- and right-handed particles, respectively.
The Lorentz invariance of helicity (for a massless particle) is manifest in the
notation of Weyl spinors, since $\psi_L$ and $\psi_R$ live in different representations of
the Lorentz group.

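It is convenient to write the normalization condition for $u(p)$ in a
Lorentz-invariant way. We saw above that $\psi^\dagger \psi$ is not Lorentz invariant. Similarly,

$$\begin{aligned}
u^\dagger u &= \left( \xi^\dagger \sqrt{p \cdot \sigma}, \; \xi^\dagger \sqrt{p \cdot \bar{\sigma}} \right) \cdot \begin{pmatrix} \sqrt{p \cdot \sigma} \, \xi \\ \sqrt{p \cdot \bar{\sigma}} \, \xi \end{pmatrix} \\
&= 2 E_{\mathbf{p}} \, \xi^\dagger \xi.
\end{aligned} \tag{3.55}$$

To make a Lorentz scalar we define

$$\bar{u}(p) = u^\dagger(p) \gamma^0. \tag{3.56}$$

Then by an almost identical calculation,

$$\bar{u} u = 2 m \, \xi^\dagger \xi. \tag{3.57}$$

This will be our normalization condition, once we also require that the
two-component spinor $\xi$ be normalized as usual: $\xi^\dagger \xi = 1$. It is also conventional to
choose basis spinors $\xi^1$ and $\xi^2$ (such as $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$) that are orthogonal. For
a massless particle Eq. (3.57) is trivial, so we must write the normalization
condition in the form of (3.55).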

Let us summarize our discussion so far. The general solution of the Dirac
equation can be written as a linear combination of plane waves. The
positive-frequency waves are of the form

$$\psi(x) = u(p) e^{-ip \cdot x}, \qquad p^2 = m^2, \quad p^0 > 0. \tag{3.58}$$

There are two linearly independent solutions for $u(p)$,

$$u^s(p) = \begin{pmatrix} \sqrt{p \cdot \sigma} \, \xi^s \\ \sqrt{p \cdot \bar{\sigma}} \, \xi^s \end{pmatrix}, \qquad s = 1, 2 \tag{3.59}$$

which we normalize according to

$$\bar{u}^r(p) u^s(p) = 2m \, \delta^{rs} \quad \text{or} \quad u^{r\dagger}(p) u^s(p) = 2 E_{\mathbf{p}} \, \delta^{rs}. \tag{3.60}$$

In exactly the same way, we can find the negative-frequency solutions:

$$\psi(x) = v(p) e^{+ip \cdot x}, \qquad p^2 = m^2, \quad p^0 > 0. \tag{3.61}$$

(Note that we have chosen to put the + sign into the exponential, rather than
having $p^0 < 0$.) There are two linearly independent solutions for $v(p)$,

$$v^s(p) = \begin{pmatrix} \sqrt{p \cdot \sigma} \, \eta^s \\ -\sqrt{p \cdot \bar{\sigma}} \, \eta^s \end{pmatrix}, \qquad s = 1, 2 \tag{3.62}$$

where $\eta^s$ is another basis of two-component spinors. These solutions are
normalized according to

$$\bar{v}^r(p) v^s(p) = -2m \, \delta^{rs} \quad \text{or} \quad v^{r\dagger}(p) v^s(p) = +2 E_{\mathbf{p}} \, \delta^{rs}. \tag{3.63}$$

The $u$'s and $v$'s are also orthogonal to each other:

$$\bar{u}^r(p) v^s(p) = \bar{v}^r(p) u^s(p) = 0. \tag{3.64}$$

Be careful, since $u^{r\dagger}(p) v^s(p) \neq 0$ and $v^{r\dagger}(p) u^s(p) \neq 0$. However, note that

$$u^{r\dagger}(\mathbf{p}) v^s(-\mathbf{p}) = v^{r\dagger}(-\mathbf{p}) u^s(\mathbf{p}) = 0, \tag{3.65}$$

where we have changed the sign of the 3-momentum in one factor of each
spinor product.
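The normalization and orthogonality relations (3.60)-(3.65) can be tabulated for a sample momentum. This is a minimal sketch assuming NumPy, the chiral-representation $\gamma^0$, and the basis spinors $\xi^s = \eta^s = (1,0), (0,1)$; the helper names `u`, `v`, and `table` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma0 = np.block([[Z2, I2], [I2, Z2]])

def msqrt(H):
    """Positive square root of a positive Hermitian matrix."""
    w, V = np.linalg.eigh(H)
    return (V * np.sqrt(w)) @ V.conj().T

def p_dot(pvec, m, bar=False):
    """p . sigma (bar=False) or p . sigma-bar (bar=True) for on-shell momentum."""
    E = np.sqrt(m**2 + pvec @ pvec)
    sign = 1.0 if bar else -1.0
    return E * I2 + sign * sum(pi * s for pi, s in zip(pvec, sigma))

def u(pvec, m, xi):
    return np.concatenate([msqrt(p_dot(pvec, m)) @ xi, msqrt(p_dot(pvec, m, True)) @ xi])

def v(pvec, m, eta):
    return np.concatenate([msqrt(p_dot(pvec, m)) @ eta, -msqrt(p_dot(pvec, m, True)) @ eta])

def table(left, right, adjoint):
    """Matrix of products: left-bar right (adjoint=True) or left-dagger right."""
    mid = gamma0 if adjoint else np.eye(4)
    return np.array([[a.conj() @ mid @ b for b in right] for a in left])

m = 1.0
p = np.array([0.3, -0.4, 1.2])
E = np.sqrt(m**2 + p @ p)
basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]

U = [u(p, m, xi) for xi in basis]
V = [v(p, m, eta) for eta in basis]
V_flipped = [v(-p, m, eta) for eta in basis]   # v^s(-p), as in Eq. (3.65)

print(table(U, U, True).real)    # expect  2m * identity
print(table(V, V, True).real)    # expect -2m * identity
print(table(U, V, False))        # generally nonzero
print(table(U, V_flipped, False))  # expect zero
```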

Spin Sums 

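In evaluating Feynman diagrams, we will often wish to sum over the
polarization states of a fermion. We can derive the relevant completeness relations
with a simple calculation:

$$\begin{aligned}
\sum_{s=1,2} u^s(p) \bar{u}^s(p) &= \sum_s \begin{pmatrix} \sqrt{p \cdot \sigma} \, \xi^s \\ \sqrt{p \cdot \bar{\sigma}} \, \xi^s \end{pmatrix} \left( \xi^{s\dagger} \sqrt{p \cdot \bar{\sigma}}, \; \xi^{s\dagger} \sqrt{p \cdot \sigma} \right) \\
&= \begin{pmatrix} \sqrt{p \cdot \sigma} \sqrt{p \cdot \bar{\sigma}} & \sqrt{p \cdot \sigma} \sqrt{p \cdot \sigma} \\ \sqrt{p \cdot \bar{\sigma}} \sqrt{p \cdot \bar{\sigma}} & \sqrt{p \cdot \bar{\sigma}} \sqrt{p \cdot \sigma} \end{pmatrix} \\
&= \begin{pmatrix} m & p \cdot \sigma \\ p \cdot \bar{\sigma} & m \end{pmatrix}.
\end{aligned}$$

In the second line we have used

$$\sum_{s=1,2} \xi^s \xi^{s\dagger} = \mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Thus we arrive at the desired formula,

$$\sum_s u^s(p) \bar{u}^s(p) = \gamma \cdot p + m. \tag{3.66}$$

Similarly,

$$\sum_s v^s(p) \bar{v}^s(p) = \gamma \cdot p - m. \tag{3.67}$$

The combination $\gamma \cdot p$ occurs so often that Feynman introduced the notation
$\not{p} \equiv \gamma^\mu p_\mu$. We will use this notation frequently from now on.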
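The completeness relations (3.66) and (3.67) lend themselves to a quick numerical check. The sketch assumes NumPy, the chiral-representation $\gamma$ matrices, and sample values for `m` and `p`; the spinors are built from Eqs. (3.59) and (3.62) with the standard basis for $\xi^s$ and $\eta^s$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def msqrt(H):
    """Positive square root of a positive Hermitian matrix."""
    w, V = np.linalg.eigh(H)
    return (V * np.sqrt(w)) @ V.conj().T

m = 1.0
p = np.array([0.3, -0.4, 1.2])
E = np.sqrt(m**2 + p @ p)
p_vec_sigma = sum(pi * s for pi, s in zip(p, sigma))
root = msqrt(E * I2 - p_vec_sigma)       # sqrt(p . sigma)
root_bar = msqrt(E * I2 + p_vec_sigma)   # sqrt(p . sigma-bar)

basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
U = [np.concatenate([root @ xi, root_bar @ xi]) for xi in basis]
V = [np.concatenate([root @ eta, -root_bar @ eta]) for eta in basis]

def outer_with_adjoint(w):
    """The 4x4 matrix w w-bar, with w-bar = w^dagger gamma^0."""
    return np.outer(w, w.conj() @ gamma[0])

spin_sum_u = sum(outer_with_adjoint(w) for w in U)
spin_sum_v = sum(outer_with_adjoint(w) for w in V)
pslash = E * gamma[0] - sum(pi * gi for pi, gi in zip(p, gamma[1:]))

print(np.allclose(spin_sum_u, pslash + m * np.eye(4)))
print(np.allclose(spin_sum_v, pslash - m * np.eye(4)))
```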


3.4 Dirac Matrices and Dirac Field Bilinears 

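We saw in Section 3.2 that the quantity $\bar{\psi} \psi$ is a Lorentz scalar. It is also
easy to show that $\bar{\psi} \gamma^\mu \psi$ is a 4-vector; we used this fact in writing down the
Dirac Lagrangian (3.34). Now let us ask a more general question: Consider the
expression $\bar{\psi} \Gamma \psi$, where $\Gamma$ is any $4 \times 4$ constant matrix. Can we decompose this
expression into terms that have definite transformation properties under the
Lorentz group? The answer is yes, if we write $\Gamma$ in terms of the following basis
of sixteen $4 \times 4$ matrices, defined as antisymmetric combinations of $\gamma$-matrices:

    $1$                                                                                                        1 of these
    $\gamma^\mu$                                                                                               4 of these
    $\gamma^{\mu\nu} = \frac{1}{2} [\gamma^\mu, \gamma^\nu] \equiv \gamma^{[\mu} \gamma^{\nu]} \equiv -i \sigma^{\mu\nu}$    6 of these
    $\gamma^{\mu\nu\rho} = \gamma^{[\mu} \gamma^\nu \gamma^{\rho]}$                                            4 of these
    $\gamma^{\mu\nu\rho\sigma} = \gamma^{[\mu} \gamma^\nu \gamma^\rho \gamma^{\sigma]}$                        1 of these
                                                                                                               16 total

The Lorentz-transformation properties of these matrices are easy to
determine. For example,

$$\begin{aligned}
\bar{\psi} \gamma^{\mu\nu} \psi &\to \left( \bar{\psi} \Lambda_{1/2}^{-1} \right) \left( \tfrac{1}{2} [\gamma^\mu, \gamma^\nu] \right) \left( \Lambda_{1/2} \psi \right) \\
&= \tfrac{1}{2} \bar{\psi} \left( \Lambda_{1/2}^{-1} \gamma^\mu \Lambda_{1/2} \Lambda_{1/2}^{-1} \gamma^\nu \Lambda_{1/2} - \Lambda_{1/2}^{-1} \gamma^\nu \Lambda_{1/2} \Lambda_{1/2}^{-1} \gamma^\mu \Lambda_{1/2} \right) \psi \\
&= \Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta \, \bar{\psi} \gamma^{\alpha\beta} \psi.
\end{aligned}$$

Each set of matrices transforms as an antisymmetric tensor of successively
higher rank.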
The last two sets of matrices can be simplified by introducing an
additional gamma matrix,

$$\gamma^5 \equiv i \gamma^0 \gamma^1 \gamma^2 \gamma^3 = -\frac{i}{4!} \epsilon^{\mu\nu\rho\sigma} \gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma. \tag{3.68}$$

Then $\gamma^{\mu\nu\rho\sigma} = -i \epsilon^{\mu\nu\rho\sigma} \gamma^5$ and $\gamma^{\mu\nu\rho} = -i \epsilon^{\mu\nu\rho\sigma} \gamma_\sigma \gamma^5$. The matrix $\gamma^5$ has the
following properties, all of which can be verified using (3.68) and the
anticommutation relations (3.22):

$$(\gamma^5)^\dagger = \gamma^5; \tag{3.69}$$

$$(\gamma^5)^2 = 1; \tag{3.70}$$

$$\{\gamma^5, \gamma^\mu\} = 0. \tag{3.71}$$

This last property implies that $[\gamma^5, S^{\mu\nu}] = 0$. Thus the Dirac representation
must be reducible, since eigenvectors of $\gamma^5$ whose eigenvalues are different
transform without mixing (this criterion for reducibility is known as Schur’s
lemma). In our basis,

$$\gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \tag{3.72}$$

in block-diagonal form. So a Dirac spinor with only left- (right-) handed
components is an eigenstate of $\gamma^5$ with eigenvalue $-1$ ($+1$), and indeed these
spinors do transform without mixing, as we saw explicitly in Section 3.2.
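Properties (3.69)-(3.72) follow from a few matrix products, which the sketch below carries out. It assumes NumPy and the chiral-representation $\gamma$ matrices; `gamma5` and the boolean names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3, first equality of Eq. (3.68).
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]

def S(mu, nu):
    return 0.25j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu])

hermitian = np.allclose(gamma5.conj().T, gamma5)                      # (3.69)
squares_to_one = np.allclose(gamma5 @ gamma5, np.eye(4))              # (3.70)
anticommutes = all(
    np.allclose(gamma5 @ gm + gm @ gamma5, 0) for gm in gamma         # (3.71)
)
commutes_with_S = all(
    np.allclose(gamma5 @ S(m, n), S(m, n) @ gamma5)
    for m in range(4) for n in range(4)
)
block_form = np.allclose(gamma5, np.diag([-1, -1, 1, 1]))             # (3.72)

print(hermitian, squares_to_one, anticommutes, commutes_with_S, block_form)
```

Since $\gamma^5$ commutes with every generator, the projectors $(1 \mp \gamma^5)/2$ pick out the upper and lower two components, which is the left- and right-handed split used in Section 3.2.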

Let us now rewrite our table of $4 \times 4$ matrices, and introduce some standard
terminology:

    $1$                                                          scalar           1
    $\gamma^\mu$                                                 vector           4
    $\sigma^{\mu\nu} = \frac{i}{2} [\gamma^\mu, \gamma^\nu]$     tensor           6
    $\gamma^\mu \gamma^5$                                        pseudo-vector    4
    $\gamma^5$                                                   pseudo-scalar    1
                                                                                  16

The terms pseudo-vector and pseudo-scalar arise from the fact that these
quantities transform as a vector and scalar, respectively, under continuous
Lorentz transformations, but with an additional sign change under parity
transformations (as we will discuss in Section 3.6).

From the vector and pseudo-vector matrices we can form two currents
out of Dirac field bilinears:

$$j^\mu(x) = \bar{\psi}(x) \gamma^\mu \psi(x); \qquad j^{\mu 5}(x) = \bar{\psi}(x) \gamma^\mu \gamma^5 \psi(x). \tag{3.73}$$

Let us compute the divergences of these currents, assuming that $\psi$ satisfies
the Dirac equation:

$$\begin{aligned}
\partial_\mu j^\mu &= (\partial_\mu \bar{\psi}) \gamma^\mu \psi + \bar{\psi} \gamma^\mu \partial_\mu \psi \\
&= (i m \bar{\psi}) \psi + \bar{\psi} (-i m \psi) \\
&= 0.
\end{aligned} \tag{3.74}$$

Thus $j^\mu$ is always conserved if $\psi(x)$ satisfies the Dirac equation. When we
couple the Dirac field to the electromagnetic field, $j^\mu$ will become the electric
current density. Similarly, one can compute

$$\partial_\mu j^{\mu 5} = 2 i m \bar{\psi} \gamma^5 \psi. \tag{3.75}$$

If $m = 0$, this current (often called the axial vector current) is also conserved.
It is then useful to form the linear combinations

$$j_L^\mu = \bar{\psi} \gamma^\mu \left( \frac{1 - \gamma^5}{2} \right) \psi, \qquad j_R^\mu = \bar{\psi} \gamma^\mu \left( \frac{1 + \gamma^5}{2} \right) \psi. \tag{3.76}$$

When $m = 0$, these are the electric current densities of left-handed and
right-handed particles, respectively, and are separately conserved.

The two currents $j^\mu(x)$ and $j^{\mu 5}(x)$ are the Noether currents corresponding
to the two transformations

$$\psi(x) \to e^{i\alpha} \psi(x) \quad \text{and} \quad \psi(x) \to e^{i\alpha\gamma^5} \psi(x).$$

The first of these is a symmetry of the Dirac Lagrangian (3.34). The second,
called a chiral transformation, is a symmetry of the derivative term in $\mathcal{L}$ but
not the mass term; thus, Noether’s theorem confirms that the axial vector
current is conserved only if $m = 0$.

Products of Dirac bilinears obey interchange relations, known as Fierz 
identities. We will discuss only the simplest of these, which will be needed 
several times later in the book. This simplest identity is most easily written 
in terms of the two-component Weyl spinors introduced in Eq. (3.36). 

The core of the relation is the identity for the $2\times 2$ matrices $\sigma^\mu$ defined
in Eq. (3.41):

$$(\sigma^\mu)_{\alpha\beta}(\sigma_\mu)_{\gamma\delta} = 2\epsilon_{\alpha\gamma}\epsilon_{\beta\delta}. \tag{3.77}$$

(Here $\alpha$, $\beta$, etc. are spinor indices, and $\epsilon$ is the antisymmetric symbol.) One
can understand this relation by noting that the indices $\alpha$, $\gamma$ transform in the
Lorentz representation of $\psi_L$, while $\beta$, $\delta$ transform in the separate representation
of $\psi_R$, and the whole quantity must be a Lorentz invariant. Alternatively,
one can just verify the 16 components of (3.77) explicitly.
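The explicit verification of the 16 components can be delegated to a few lines of NumPy. This is only a sketch: it assumes $\sigma^\mu = (1, \boldsymbol\sigma)$, $\sigma_\mu = (1, -\boldsymbol\sigma)$, and the sign convention $\epsilon_{12} = +1$ (the product $\epsilon\epsilon$ is insensitive to that sign).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
sigma_up = [I2] + pauli                 # sigma^mu = (1, sigma^i)
sigma_dn = [I2] + [-s for s in pauli]   # sigma_mu = (1, -sigma^i)
eps = np.array([[0, 1], [-1, 0]], dtype=complex)  # antisymmetric symbol

# Left side of (3.77): sum over mu of (sigma^mu)_{ab} (sigma_mu)_{cd}.
lhs = sum(np.einsum("ab,cd->abcd", up, dn) for up, dn in zip(sigma_up, sigma_dn))
# Right side of (3.77): 2 eps_{ac} eps_{bd}.
rhs = 2 * np.einsum("ac,bd->abcd", eps, eps)
print(np.allclose(lhs, rhs))  # True
```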

By sandwiching identity (3.77) between the right-handed portions (i.e.,
lower half) of Dirac spinors $u_1$, $u_2$, $u_3$, $u_4$, we find the identity

$$\begin{aligned}
(\bar u_{1R}\sigma^\mu u_{2R})(\bar u_{3R}\sigma_\mu u_{4R}) &= 2\epsilon_{\alpha\gamma}\,\bar u_{1R\alpha}\bar u_{3R\gamma}\;\epsilon_{\beta\delta}\,u_{2R\beta}u_{4R\delta} \\
&= -(\bar u_{1R}\sigma^\mu u_{4R})(\bar u_{3R}\sigma_\mu u_{2R}).
\end{aligned} \tag{3.78}$$

This nontrivial relation says that the product of bilinears in (3.78) is
antisymmetric under the interchange of the labels 2 and 4, and also under the
interchange of 1 and 3. Identity (3.77) also holds for $\bar\sigma^\mu$, and so we also find

$$(\bar u_{1L}\bar\sigma^\mu u_{2L})(\bar u_{3L}\bar\sigma_\mu u_{4L}) = -(\bar u_{1L}\bar\sigma^\mu u_{4L})(\bar u_{3L}\bar\sigma_\mu u_{2L}). \tag{3.79}$$

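It is sometimes useful to combine the Fierz identity (3.78) with the identity
linking $\sigma^\mu$ and $\bar\sigma^\mu$:

$$\epsilon_{\alpha\beta}(\sigma^\mu)_{\beta\gamma} = (\bar\sigma^{\mu T})_{\alpha\beta}\,\epsilon_{\beta\gamma}. \tag{3.80}$$

This relation is also straightforward to verify explicitly. By the use of (3.80),
(3.79), and the relation

$$\sigma^\mu\bar\sigma_\mu = 4, \tag{3.81}$$

we can, for example, simplify horrible products of bilinears such as

$$\begin{aligned}
(\bar u_{1L}\bar\sigma^\mu\sigma^\nu\bar\sigma^\lambda u_{2L})(\bar u_{3L}\bar\sigma_\mu\sigma_\nu\bar\sigma_\lambda u_{4L})
&= 2\epsilon_{\alpha\gamma}\,\bar u_{1L\alpha}\bar u_{3L\gamma}\;\epsilon_{\beta\delta}\,(\sigma^\nu\bar\sigma^\lambda u_{2L})_\beta(\sigma_\nu\bar\sigma_\lambda u_{4L})_\delta \\
&= 2\epsilon_{\alpha\gamma}\,\bar u_{1L\alpha}\bar u_{3L\gamma}\;\epsilon_{\beta\delta}\,u_{2L\beta}(\sigma^\lambda\bar\sigma^\nu\sigma_\nu\bar\sigma_\lambda u_{4L})_\delta \\
&= 2\cdot(4)^2\cdot\epsilon_{\alpha\gamma}\,\bar u_{1L\alpha}\bar u_{3L\gamma}\;\epsilon_{\beta\delta}\,u_{2L\beta}u_{4L\delta} \\
&= 16\,(\bar u_{1L}\bar\sigma^\mu u_{2L})(\bar u_{3L}\bar\sigma_\mu u_{4L}).
\end{aligned} \tag{3.82}$$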
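Since (3.80), (3.81), and (3.82) are statements about fixed $2\times 2$ matrices, they can be checked numerically with the spinors stripped off. The sketch below assumes $\sigma^\mu = (1, \boldsymbol\sigma)$, $\bar\sigma^\mu = (1, -\boldsymbol\sigma)$, and a mostly-minus metric for lowering indices; the variable names are illustrative.

```python
import numpy as np
from itertools import product

I2 = np.eye(2, dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
eps = np.array([[0, 1], [-1, 0]], dtype=complex)
sigma_up = [I2] + pauli                    # sigma^mu
sigma_dn = [I2] + [-s for s in pauli]      # sigma_mu
sigmabar_up = [I2] + [-s for s in pauli]   # sigmabar^mu
sigmabar_dn = [I2] + pauli                 # sigmabar_mu

# (3.80): eps sigma^mu = (sigmabar^mu)^T eps, for each mu.
check_380 = all(
    np.allclose(eps @ sigma_up[m], sigmabar_up[m].T @ eps) for m in range(4)
)
# (3.81): sigma^mu sigmabar_mu = 4.
check_381 = np.allclose(sum(sigma_up[m] @ sigmabar_dn[m] for m in range(4)), 4 * I2)

# (3.82) with the spinors removed: a rank-4 tensor identity in spinor indices.
lhs = np.zeros((2, 2, 2, 2), dtype=complex)
for m, n, l in product(range(4), repeat=3):
    upper = sigmabar_up[m] @ sigma_up[n] @ sigmabar_up[l]
    lower = sigmabar_dn[m] @ sigma_dn[n] @ sigmabar_dn[l]
    lhs += np.einsum("ab,cd->abcd", upper, lower)
rhs = 16 * sum(
    np.einsum("ab,cd->abcd", sigmabar_up[m], sigmabar_dn[m]) for m in range(4)
)
check_382 = np.allclose(lhs, rhs)
print(check_380, check_381, check_382)  # True True True
```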

There are also Fierz rearrangement identities for 4-component Dirac 
spinors and 4x4 Dirac matrices. To derive these, however, it is useful to 
take a more systematic approach. Problem 3.6 presents a general method and 
gives some examples of its application. 

3.5 Quantization of the Dirac Field 

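We are now ready to construct the quantum theory of the free Dirac field.
From the Lagrangian

$$\mathcal{L} = \bar\psi(i\partial\!\!\!/ - m)\psi = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi, \tag{3.83}$$

we see that the canonical momentum conjugate to $\psi$ is $i\psi^\dagger$, and thus the
Hamiltonian is

$$H = \int d^3x\,\bar\psi(-i\boldsymbol\gamma\cdot\nabla + m)\psi = \int d^3x\,\psi^\dagger\left[-i\gamma^0\boldsymbol\gamma\cdot\nabla + m\gamma^0\right]\psi. \tag{3.84}$$

If we define $\boldsymbol\alpha = \gamma^0\boldsymbol\gamma$, $\beta = \gamma^0$, you may recognize the quantity in brackets as
the Dirac Hamiltonian of one-particle quantum mechanics:

$$h_D = -i\boldsymbol\alpha\cdot\nabla + m\beta. \tag{3.85}$$

How Not to Quantize the Dirac Field: A Lesson in Spin and Statistics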

To quantize the Dirac field in analogy with the Klein-Gordon field we would
impose the canonical commutation relations

$$[\psi_a(\mathbf{x}), \psi_b^\dagger(\mathbf{y})] = \delta^{(3)}(\mathbf{x} - \mathbf{y})\,\delta_{ab}, \qquad \text{(equal times)} \tag{3.86}$$

where $a$ and $b$ denote the spinor components of $\psi$. This already looks peculiar:
If $\psi(x)$ were real-valued, the left-hand side would be antisymmetric under
$\mathbf{x} \leftrightarrow \mathbf{y}$, while the right-hand side is symmetric. But $\psi$ is complex, so we
do not have a contradiction yet. In fact, we will soon find that much worse
problems arise when we impose commutation relations on the Dirac field. But
it is instructive to see how far we can get, in order to better understand the
relation between spin and statistics. So let us press on; just remember that
the next few pages will eventually turn out to be a blind alley.

Our first task is to find a representation of the commutation relations in
terms of creation and annihilation operators that diagonalizes $H$. From the
form of the Hamiltonian (3.84), it will clearly be helpful to expand $\psi(x)$ in a
basis of eigenfunctions of $h_D$. We know these eigenfunctions already from our
calculations in Section 3.3. There we found that

$$\left[i\gamma^0\partial_0 + i\boldsymbol\gamma\cdot\nabla - m\right]u^s(p)e^{-ip\cdot x} = 0,$$

so $u^s(p)e^{i\mathbf{p}\cdot\mathbf{x}}$ are eigenfunctions of $h_D$ with eigenvalues $E_{\mathbf{p}}$. Similarly, the
functions $v^s(p)e^{-i\mathbf{p}\cdot\mathbf{x}}$ (or equivalently, $v^s(-p)e^{+i\mathbf{p}\cdot\mathbf{x}}$) are eigenfunctions of
$h_D$ with eigenvalues $-E_{\mathbf{p}}$. These form a complete set of eigenfunctions, since
for any $\mathbf{p}$ there are two $u$'s and two $v$'s, giving us four eigenvectors of the $4\times 4$
matrix $h_D$.
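The counting of eigenvectors can be illustrated numerically. The sketch below assumes the chiral representation of the gamma matrices and replaces $-i\nabla$ by a fixed momentum $\mathbf{p}$, so that $h_D$ becomes the $4\times 4$ matrix $\boldsymbol\alpha\cdot\mathbf{p} + m\beta$; the function name `dirac_hamiltonian` and the numerical values are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma0 = np.block([[Z2, I2], [I2, Z2]])
gamma_i = [np.block([[Z2, s], [-s, Z2]]) for s in pauli]


def dirac_hamiltonian(p3, m):
    """h_D in momentum space: alpha . p + m beta, with alpha = gamma0 gamma, beta = gamma0."""
    alpha = [gamma0 @ g for g in gamma_i]
    return sum(pi * a for pi, a in zip(p3, alpha)) + m * gamma0


p3 = np.array([0.3, -1.2, 0.5])   # arbitrary illustrative momentum
m = 0.8                           # arbitrary illustrative mass
E = np.sqrt(m**2 + p3 @ p3)
evals = np.linalg.eigvalsh(dirac_hamiltonian(p3, m))  # ascending order
print(np.allclose(evals, [-E, -E, E, E]))  # True
```

Each of the eigenvalues $+E_{\mathbf{p}}$ and $-E_{\mathbf{p}}$ appears twice, matching the two $u$'s and two $v$'s.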

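Expanding $\psi$ in this basis, we obtain

$$\psi(\mathbf{x}) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\,e^{i\mathbf{p}\cdot\mathbf{x}}\sum_{s=1,2}\left(a^s_{\mathbf{p}}u^s(\mathbf{p}) + b^s_{-\mathbf{p}}v^s(-\mathbf{p})\right), \tag{3.87}$$

where $a^s_{\mathbf{p}}$ and $b^s_{\mathbf{p}}$ are operator coefficients. (For now we work in the Schrödinger
picture, where $\psi$ does not depend on time.) Postulate the commutation relations

$$[a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}] = [b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}] = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}. \tag{3.88}$$

It is then easy to verify the commutation relations (3.86) for $\psi$ and $\psi^\dagger$:

$$\begin{aligned}
[\psi(\mathbf{x}), \psi^\dagger(\mathbf{y})] &= \int\frac{d^3p\,d^3q}{(2\pi)^6}\frac{1}{\sqrt{2E_{\mathbf{p}}\,2E_{\mathbf{q}}}}\,e^{i(\mathbf{p}\cdot\mathbf{x} - \mathbf{q}\cdot\mathbf{y})} \\
&\qquad\times\sum_{r,s}\left([a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}]\,u^r(\mathbf{p})\bar u^s(\mathbf{q}) + [b^r_{-\mathbf{p}}, b^{s\dagger}_{-\mathbf{q}}]\,v^r(-\mathbf{p})\bar v^s(-\mathbf{q})\right)\gamma^0 \\
&= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,e^{i\mathbf{p}\cdot(\mathbf{x}-\mathbf{y})} \\
&\qquad\times\left[\left(\gamma^0E_{\mathbf{p}} - \boldsymbol\gamma\cdot\mathbf{p} + m\right) + \left(\gamma^0E_{\mathbf{p}} + \boldsymbol\gamma\cdot\mathbf{p} - m\right)\right]\gamma^0 \\
&= \delta^{(3)}(\mathbf{x} - \mathbf{y})\times\mathbf{1}_{4\times 4}.
\end{aligned} \tag{3.89}$$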


In the second step we have used the spin sum completeness relations (3.66) 
and (3.67). 
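The spin sums used in that step, and the cancellation that leaves $2E_{\mathbf{p}}\cdot\mathbf{1}$ inside the integral, can be checked numerically. The sketch below assumes the chiral-representation spinors $u^s(p) = (\sqrt{p\cdot\sigma}\,\xi^s, \sqrt{p\cdot\bar\sigma}\,\xi^s)$ and $v^s(p) = (\sqrt{p\cdot\sigma}\,\eta^s, -\sqrt{p\cdot\bar\sigma}\,\eta^s)$; the helper names `sqrt_herm`, `spinors`, `slash`, and `spin_sum` are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma0 = np.block([[Z2, I2], [I2, Z2]])
gamma_i = [np.block([[Z2, s], [-s, Z2]]) for s in pauli]


def sqrt_herm(M):
    """Square root of a positive Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


def spinors(p3, m):
    """Return E_p and the lists [u^1, u^2], [v^1, v^2] for 3-momentum p3."""
    E = np.sqrt(m**2 + p3 @ p3)
    p_sigma = E * I2 - sum(pi * s for pi, s in zip(p3, pauli))
    p_sigmabar = E * I2 + sum(pi * s for pi, s in zip(p3, pauli))
    a, b = sqrt_herm(p_sigma), sqrt_herm(p_sigmabar)
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    us = [np.concatenate([a @ xi, b @ xi]) for xi in basis]
    vs = [np.concatenate([a @ xi, -(b @ xi)]) for xi in basis]
    return E, us, vs


def slash(E, p3):
    """gamma^mu p_mu = gamma0 E - gamma . p."""
    return E * gamma0 - sum(pi * g for pi, g in zip(p3, gamma_i))


def spin_sum(ws):
    """sum_s w^s wbar^s, with wbar = w^dagger gamma0."""
    return sum(np.outer(w, w.conj()) @ gamma0 for w in ws)


p3, m = np.array([0.3, -1.2, 0.5]), 0.8   # arbitrary illustrative values
E, us, vs = spinors(p3, m)
_, _, vs_minus = spinors(-p3, m)

ok_u = np.allclose(spin_sum(us), slash(E, p3) + m * np.eye(4))
ok_v = np.allclose(spin_sum(vs), slash(E, p3) - m * np.eye(4))
# Bracket in the second step of (3.89), times gamma0, divided by 2E:
bracket = (spin_sum(us) + spin_sum(vs_minus)) @ gamma0 / (2 * E)
print(ok_u, ok_v, np.allclose(bracket, np.eye(4)))  # True True True
```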
We are now ready to write $H$ in terms of the $a$'s and $b$'s. After another
short calculation (making use of the orthogonality relations (3.60), (3.63), and
(3.65)), we find

$$H = \int\frac{d^3p}{(2\pi)^3}\sum_s\left(E_{\mathbf{p}}\,a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} - E_{\mathbf{p}}\,b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\right). \tag{3.90}$$

Something is terribly wrong with the second term: By creating more and
more particles with $b^\dagger$, we can lower the energy indefinitely. (It would not
have helped to rename $b \leftrightarrow b^\dagger$, since doing so would ruin the commutation
relation (3.89).)

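We seem to be in rather deep trouble, but again let's press on, and
investigate the causality of this theory. To do this we should compute $[\psi(x), \psi^\dagger(y)]$
(or more conveniently, $[\psi(x), \bar\psi(y)]$) at non-equal times and hope to get zero
outside the light-cone. First we must switch to the Heisenberg picture and
restore the time-dependence of $\psi$ and $\bar\psi$. Using the relations

$$e^{iHt}a^s_{\mathbf{p}}e^{-iHt} = a^s_{\mathbf{p}}e^{-iE_{\mathbf{p}}t}, \qquad e^{iHt}b^s_{\mathbf{p}}e^{-iHt} = b^s_{\mathbf{p}}e^{+iE_{\mathbf{p}}t}, \tag{3.91}$$

we immediately have

$$\begin{aligned}
\psi(x) &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(a^s_{\mathbf{p}}u^s(p)e^{-ip\cdot x} + b^s_{\mathbf{p}}v^s(p)e^{ip\cdot x}\right); \\
\bar\psi(x) &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(a^{s\dagger}_{\mathbf{p}}\bar u^s(p)e^{ip\cdot x} + b^{s\dagger}_{\mathbf{p}}\bar v^s(p)e^{-ip\cdot x}\right).
\end{aligned} \tag{3.92}$$

We can now calculate the general commutator:

$$\begin{aligned}
[\psi_a(x), \bar\psi_b(y)] &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\sum_s\left(u^s_a(p)\bar u^s_b(p)e^{-ip\cdot(x-y)} + v^s_a(p)\bar v^s_b(p)e^{ip\cdot(x-y)}\right) \\
&= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\left((p\!\!\!/ + m)_{ab}\,e^{-ip\cdot(x-y)} + (p\!\!\!/ - m)_{ab}\,e^{ip\cdot(x-y)}\right) \\
&= (i\partial\!\!\!/_x + m)_{ab}\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\left(e^{-ip\cdot(x-y)} - e^{ip\cdot(x-y)}\right) \\
&= (i\partial\!\!\!/_x + m)_{ab}\,[\phi(x), \phi(y)].
\end{aligned}$$

Since $[\phi(x), \phi(y)]$ (the commutator of a real Klein-Gordon field) vanishes
outside the light-cone, this quantity does also.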

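There is something odd, however, about this solution to the causality
problem. Let $|0\rangle$ be the state that is annihilated by all the $a^s_{\mathbf{p}}$ and $b^s_{\mathbf{p}}$:
$a^s_{\mathbf{p}}|0\rangle = b^s_{\mathbf{p}}|0\rangle = 0$. Then

$$\begin{aligned}
[\psi_a(x), \bar\psi_b(y)] &= \langle 0|\,[\psi_a(x), \bar\psi_b(y)]\,|0\rangle \\
&= \langle 0|\,\psi_a(x)\bar\psi_b(y)\,|0\rangle - \langle 0|\,\bar\psi_b(y)\psi_a(x)\,|0\rangle,
\end{aligned}$$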





just as for the Klein-Gordon field. But in the Klein-Gordon case, we got one
term of the commutator from each of these two pieces: the propagation of
a particle from $y$ to $x$ was canceled by the propagation of an antiparticle
from $x$ to $y$ outside the light-cone. Here both terms come from the first piece,
$\langle 0|\,\psi(x)\bar\psi(y)\,|0\rangle$, since the second piece is zero. The cancellation is between
positive-energy particles and negative-energy particles, both propagating from
$y$ to $x$.

This observation can actually lead us to a resolution of the negative-energy
problem. One of the assumptions we made in quantizing the Dirac
theory must have been incorrect. Let us therefore forget about the postulated
commutation relations (3.86) and (3.88), and see whether we can find a way
for positive-energy particles to propagate in both directions. We will also have
to drop our definition of the vacuum $|0\rangle$ as the state that is annihilated by all
$a^s_{\mathbf{p}}$ and $b^s_{\mathbf{p}}$. We will, however, retain the expressions (3.92) for $\psi(x)$ and $\bar\psi(x)$
as Heisenberg operators, since if $\psi(x)$ and $\bar\psi(x)$ solve the Dirac equation, they
must be decomposable into such plane-wave solutions.

First consider the propagation amplitude $\langle 0|\,\psi(x)\bar\psi(y)\,|0\rangle$, which is to
represent a positive-energy particle propagating from $y$ to $x$. In this case we
want the (Heisenberg) state $\bar\psi(y)|0\rangle$ to be made up of only positive-energy,
or negative-frequency components (since a Heisenberg state $\Psi_H = e^{+iHt}\Psi_S$).
Thus only the $a^{s\dagger}_{\mathbf{p}}$ term of $\bar\psi(y)$ can contribute, which means that $b^{s\dagger}_{\mathbf{p}}$ must
annihilate the vacuum. Similarly $\langle 0|\,\psi(x)$ can contain only positive-frequency
components. Thus we have


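$$\begin{aligned}
\langle 0|\,\psi(x)\bar\psi(y)\,|0\rangle = \langle 0|&\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_r a^r_{\mathbf{p}}u^r(p)e^{-ipx} \\
\times&\int\frac{d^3q}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{q}}}}\sum_s a^{s\dagger}_{\mathbf{q}}\bar u^s(q)e^{iqy}\,|0\rangle.
\end{aligned} \tag{3.93}$$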


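We can say something about the matrix element $\langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle$ even without
knowing how to interchange $a^r_{\mathbf{p}}$ and $a^{s\dagger}_{\mathbf{q}}$, by using translational and rotational
invariance. If the ground state $|0\rangle$ is to be invariant under translations, we
must have $|0\rangle = e^{i\mathbf{P}\cdot\mathbf{x}}|0\rangle$. Furthermore, since $a^{s\dagger}_{\mathbf{q}}$ creates momentum $\mathbf{q}$, we
can use Eq. (2.48) to compute

$$\begin{aligned}
\langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle &= \langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,e^{i\mathbf{P}\cdot\mathbf{x}}\,|0\rangle \\
&= e^{i(\mathbf{p}-\mathbf{q})\cdot\mathbf{x}}\,\langle 0|\,e^{i\mathbf{P}\cdot\mathbf{x}}\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle \\
&= e^{i(\mathbf{p}-\mathbf{q})\cdot\mathbf{x}}\,\langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle.
\end{aligned}$$

This says that if $\langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle$ is to be nonzero, $\mathbf{p}$ must equal $\mathbf{q}$. Similarly, it
can be shown that rotational invariance of $|0\rangle$ implies $r = s$. (This should be
intuitively clear, and can be checked after we discuss the angular momentum
operator later in this section.) From these considerations we conclude that
the matrix element can be written

$$\langle 0|\,a^r_{\mathbf{p}}a^{s\dagger}_{\mathbf{q}}\,|0\rangle = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}\cdot A(\mathbf{p}),$$

where $A(\mathbf{p})$ is so far undetermined. Note, however, that if the norm of a state
is always positive (as it should be in any self-respecting Hilbert space), $A(\mathbf{p})$
must be greater than zero. We can now go back to (3.93), and write

$$\begin{aligned}
\langle 0|\,\psi(x)\bar\psi(y)\,|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\sum_s u^s(p)\bar u^s(p)\,A(\mathbf{p})\,e^{-ip(x-y)} \\
&= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,(p\!\!\!/ + m)\,A(\mathbf{p})\,e^{-ip(x-y)}.
\end{aligned}$$

This expression is properly invariant under boosts only if $A(\mathbf{p})$ is a Lorentz
scalar, i.e., $A(\mathbf{p}) = A(p^2)$. Since $p^2 = m^2$, $A$ must be a constant. So finally we
obtain

$$\langle 0|\,\psi_a(x)\bar\psi_b(y)\,|0\rangle = (i\partial\!\!\!/_x + m)_{ab}\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,e^{-ip(x-y)}\cdot A. \tag{3.94}$$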

Similarly, in the amplitude $\langle 0|\,\bar\psi(y)\psi(x)\,|0\rangle$, we want the only contributions
to be from the positive-frequency terms of $\bar\psi(y)$ and the negative-frequency
terms of $\psi(x)$. So $a^s_{\mathbf{p}}$ still annihilates the vacuum, but $b^s_{\mathbf{p}}$ does not.
Then by arguments identical to those given above, we have

$$\langle 0|\,\bar\psi_b(y)\psi_a(x)\,|0\rangle = -(i\partial\!\!\!/_x + m)_{ab}\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,e^{ip(x-y)}\cdot B, \tag{3.95}$$

where $B$ is another positive constant. The minus sign is important; it comes
from the completeness relation (3.67) for $\sum v\bar v$ and the sign of $x$ in the
exponential factor. It implies that we cannot have $\langle 0|\,[\psi(x), \bar\psi(y)]\,|0\rangle = 0$
outside the light-cone: The two terms (3.94) and (3.95) would indeed cancel if
$A = -B$, but this is impossible since $A$ and $B$ must both be positive.

The solution, however, is now at hand. By setting $A = B = 1$, it is easy
to obtain (outside the light-cone)

$$\langle 0|\,\psi_a(x)\bar\psi_b(y)\,|0\rangle = -\langle 0|\,\bar\psi_b(y)\psi_a(x)\,|0\rangle.$$

That is, the spinor fields anticommute at spacelike separation. This is enough
to preserve causality, since all reasonable observables (such as energy, charge,
and particle number) are built out of an even number of spinor fields; for any
such observables $\mathcal{O}_1$ and $\mathcal{O}_2$, we still have $[\mathcal{O}_1(x), \mathcal{O}_2(y)] = 0$ for $(x - y)^2 < 0$.

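And remarkably, postulating anticommutation relations for the Dirac field
solves the negative energy problem. The equal-time anticommutation relations
will be

$$\begin{aligned}
\{\psi_a(\mathbf{x}), \psi_b^\dagger(\mathbf{y})\} &= \delta^{(3)}(\mathbf{x} - \mathbf{y})\,\delta_{ab}; \\
\{\psi_a(\mathbf{x}), \psi_b(\mathbf{y})\} &= \{\psi_a^\dagger(\mathbf{x}), \psi_b^\dagger(\mathbf{y})\} = 0.
\end{aligned} \tag{3.96}$$

We can expand $\psi(x)$ in terms of $a^s_{\mathbf{p}}$ and $b^s_{\mathbf{p}}$ as before (Eq. (3.87)). The creation
and annihilation operators must now obey

$$\{a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}\} = \{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} \tag{3.97}$$

(with all other anticommutators equal to zero) in order that (3.96) be satisfied.
Another computation gives the Hamiltonian,

$$H = \int\frac{d^3p}{(2\pi)^3}\sum_s\left(E_{\mathbf{p}}\,a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} - E_{\mathbf{p}}\,b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\right),$$

which is the same as before; $b^{s\dagger}_{\mathbf{p}}$ still creates negative energy. However, the
relation $\{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}$ is symmetric between $b^r_{\mathbf{p}}$ and $b^{s\dagger}_{\mathbf{q}}$. So
let us simply redefine

$$\tilde b^s_{\mathbf{p}} \equiv b^{s\dagger}_{\mathbf{p}}; \qquad \tilde b^{s\dagger}_{\mathbf{p}} \equiv b^s_{\mathbf{p}}. \tag{3.98}$$

These of course obey exactly the same anticommutation relations, but now
the second term in the Hamiltonian is

$$-E_{\mathbf{p}}\,b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}} = +E_{\mathbf{p}}\,\tilde b^{s\dagger}_{\mathbf{p}}\tilde b^s_{\mathbf{p}} - (\text{const}).$$

If we choose $|0\rangle$ to be the state that is annihilated by $a^s_{\mathbf{p}}$ and $\tilde b^s_{\mathbf{p}}$, then all
excitations of $|0\rangle$ have positive energy.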

What happened? To better understand this trick, let us abandon the field
theory for a moment and consider a theory with a single pair of $b$ and $b^\dagger$
operators obeying $\{b, b^\dagger\} = 1$ and $\{b, b\} = \{b^\dagger, b^\dagger\} = 0$. Choose a state $|0\rangle$
such that $b|0\rangle = 0$. Then $b^\dagger|0\rangle$ is a new state; call it $|1\rangle$. This state satisfies
$b|1\rangle = |0\rangle$ and $b^\dagger|1\rangle = 0$. So $b$ and $b^\dagger$ act on a Hilbert space of only two states,
$|0\rangle$ and $|1\rangle$. We might say that $|0\rangle$ represents an "empty" state, and that $b^\dagger$
"fills" the state. But we could equally well call $|1\rangle$ the empty state and say
that $b = \tilde b^\dagger$ fills it. The two descriptions are completely equivalent, until we
specify some observable that allows us to distinguish the states physically. In
our case the correct choice is to take the state of lower energy to be the empty
one. And it is less confusing to put the dagger on the operator that creates
positive energy. That is exactly what we have done.

Note, by the way, that since $(b^\dagger)^2 = 0$, the state cannot be filled twice.
More generally, the anticommutation relations imply that any multiparticle
state is antisymmetric under the interchange of two particles:
$a^\dagger_{\mathbf{p}}a^\dagger_{\mathbf{q}}|0\rangle = -a^\dagger_{\mathbf{q}}a^\dagger_{\mathbf{p}}|0\rangle$. Thus we conclude that if the ladder operators obey
anticommutation relations, the corresponding particles obey Fermi-Dirac statistics.
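The two-state example can be made concrete with $2\times 2$ matrices. The sketch below assumes the basis ordering $(|0\rangle, |1\rangle)$ and an arbitrary illustrative energy `E`; it checks the anticommutation relations, the fact that the state cannot be filled twice, and that relabeling $\tilde b = b^\dagger$ turns $-E\,b^\dagger b$ into $+E\,\tilde b^\dagger\tilde b$ minus a constant.

```python
import numpy as np

# Basis ordering (|0>, |1>): b|0> = 0 and b|1> = |0>.
b = np.array([[0.0, 1.0], [0.0, 0.0]])
bdag = b.T

anticomm = b @ bdag + bdag @ b           # {b, b^dagger}
filled_twice = bdag @ bdag               # (b^dagger)^2

E = 1.5                                  # arbitrary illustrative energy
H = -E * bdag @ b                        # negative-energy form
btilde, btilde_dag = bdag, b             # relabeling as in (3.98)
H_relabeled = E * btilde_dag @ btilde - E * np.eye(2)

lowest = int(np.argmin(np.diag(H)))      # index of the lowest-energy basis state
print(np.allclose(anticomm, np.eye(2)), np.allclose(filled_twice, 0))  # True True
print(np.allclose(H, H_relabeled), lowest)                            # True 1
```

The lowest-energy state is $|1\rangle$, which is the state annihilated by $\tilde b = b^\dagger$; in the relabeled description it plays the role of the empty state.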

We have just shown that in order to ensure that the vacuum has only
positive-energy excitations, we must quantize the Dirac field with
anticommutation relations; under these conditions the particles associated with the
Dirac field obey Fermi-Dirac statistics. This conclusion is part of a more
general result, first derived by Pauli*: Lorentz invariance, positive energies,
positive norms, and causality together imply that particles of integer spin obey
Bose-Einstein statistics, while particles of half-odd-integer spin obey
Fermi-Dirac statistics.

The Quantized Dirac Field 

Let us now summarize the results of the quantized Dirac theory in a systematic
way. Since the dust has settled, we should clean up our notation: From now
on we will write $\tilde b_{\mathbf{p}}$ (the operator that lowers the energy of a state) simply
as $b_{\mathbf{p}}$, and $\tilde b^\dagger_{\mathbf{p}}$ as $b^\dagger_{\mathbf{p}}$. All the expressions we will need in our later work are
listed below; corresponding expressions above, where they differ, should be
forgotten.

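First we write the field operators:

$$\psi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(a^s_{\mathbf{p}}u^s(p)e^{-ip\cdot x} + b^{s\dagger}_{\mathbf{p}}v^s(p)e^{ip\cdot x}\right); \tag{3.99}$$

$$\bar\psi(x) = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(b^s_{\mathbf{p}}\bar v^s(p)e^{-ip\cdot x} + a^{s\dagger}_{\mathbf{p}}\bar u^s(p)e^{ip\cdot x}\right). \tag{3.100}$$

The creation and annihilation operators obey the anticommutation rules

$$\{a^r_{\mathbf{p}}, a^{s\dagger}_{\mathbf{q}}\} = \{b^r_{\mathbf{p}}, b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs}, \tag{3.101}$$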

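with all other anticommutators equal to zero. The equal-time anticommutation
relations for $\psi$ and $\psi^\dagger$ are then

$$\begin{aligned}
\{\psi_a(\mathbf{x}), \psi_b^\dagger(\mathbf{y})\} &= \delta^{(3)}(\mathbf{x} - \mathbf{y})\,\delta_{ab}; \\
\{\psi_a(\mathbf{x}), \psi_b(\mathbf{y})\} &= \{\psi_a^\dagger(\mathbf{x}), \psi_b^\dagger(\mathbf{y})\} = 0.
\end{aligned} \tag{3.102}$$

The vacuum $|0\rangle$ is defined to be the state such that

$$a^s_{\mathbf{p}}|0\rangle = b^s_{\mathbf{p}}|0\rangle = 0. \tag{3.103}$$

The Hamiltonian can be written

$$H = \int\frac{d^3p}{(2\pi)^3}\sum_s E_{\mathbf{p}}\left(a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} + b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\right), \tag{3.104}$$

where we have dropped the infinite constant term that comes from
anticommuting $b^s_{\mathbf{p}}$ and $b^{s\dagger}_{\mathbf{p}}$. From this we see that the vacuum is the state of lowest
energy, as desired. The momentum operator is

$$\mathbf{P} = \int d^3x\,\psi^\dagger(-i\nabla)\psi = \int\frac{d^3p}{(2\pi)^3}\sum_s\mathbf{p}\left(a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} + b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\right). \tag{3.105}$$

*W. Pauli, Phys. Rev. 58, 716 (1940), reprinted in Schwinger (1958). A rigorous
treatment is given by R. F. Streater and A. S. Wightman, PCT, Spin and Statistics,
and All That (Benjamin/Cummings, Reading, Mass., 1964).

Thus both $a^{s\dagger}_{\mathbf{p}}$ and $b^{s\dagger}_{\mathbf{p}}$ create particles with energy $+E_{\mathbf{p}}$ and momentum $\mathbf{p}$.
We will refer to the particles created by $a^{s\dagger}_{\mathbf{p}}$ as fermions and to those created
by $b^{s\dagger}_{\mathbf{p}}$ as antifermions.

The one-particle states

$$|\mathbf{p}, s\rangle \equiv \sqrt{2E_{\mathbf{p}}}\,a^{s\dagger}_{\mathbf{p}}|0\rangle \tag{3.106}$$

are defined so that their inner product

$$\langle\mathbf{p}, r|\mathbf{q}, s\rangle = 2E_{\mathbf{p}}(2\pi)^3\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs} \tag{3.107}$$

is Lorentz invariant. This implies that the operator $U(\Lambda)$ that implements
Lorentz transformations on the states of the Hilbert space is unitary, even
though for boosts, $\Lambda_{\frac12}$ is not unitary.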

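It will be reassuring to do a consistency check, to see that $U(\Lambda)$
implements the right transformation on $\psi(x)$. So calculate

$$U\psi(x)U^{-1} = U\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(a^s_{\mathbf{p}}u^s(p)e^{-ipx} + b^{s\dagger}_{\mathbf{p}}v^s(p)e^{ipx}\right)U^{-1}. \tag{3.108}$$

We can concentrate on the first term; the second is completely analogous.
Equation (3.106) implies that $a^s_{\mathbf{p}}$ transforms according to

$$U(\Lambda)\,a^s_{\mathbf{p}}\,U^{-1}(\Lambda) = \sqrt{\frac{E_{\Lambda\mathbf{p}}}{E_{\mathbf{p}}}}\;a^s_{\Lambda\mathbf{p}}, \tag{3.109}$$

assuming that the axis of spin quantization is parallel to the boost or rotation
axis. To use this relation to evaluate (3.108), rewrite the integral as

$$\int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\,a^s_{\mathbf{p}} = \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\cdot\sqrt{2E_{\mathbf{p}}}\,a^s_{\mathbf{p}}.$$

The second factor is transformed in a simple way by $U$, and the first is a
Lorentz-invariant integral. Thus, if we apply (3.109) and make the substitution
$\tilde p = \Lambda p$, Eq. (3.108) becomes

$$U(\Lambda)\psi(x)U^{-1}(\Lambda) = \int\frac{d^3\tilde p}{(2\pi)^3}\frac{1}{2E_{\tilde{\mathbf{p}}}}\sum_s u^s(\Lambda^{-1}\tilde p)\,\sqrt{2E_{\tilde{\mathbf{p}}}}\;a^s_{\tilde{\mathbf{p}}}\,e^{-i\tilde p\cdot\Lambda x} + \cdots.$$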


But $u^s(\Lambda^{-1}\tilde p) = \Lambda^{-1}_{\frac12}u^s(\tilde p)$, so indeed we have

$$\begin{aligned}
U(\Lambda)\psi(x)U^{-1}(\Lambda) &= \int\frac{d^3\tilde p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\tilde{\mathbf{p}}}}}\sum_s\Lambda^{-1}_{\frac12}u^s(\tilde p)\,a^s_{\tilde{\mathbf{p}}}\,e^{-i\tilde p\cdot\Lambda x} + \cdots \\
&= \Lambda^{-1}_{\frac12}\psi(\Lambda x).
\end{aligned} \tag{3.110}$$

This result says that the transformed field creates and destroys particles
at the point $\Lambda x$, as it must. Note, however, that this transformation appears
to be in the wrong direction compared to Eq. (3.2), where the transformed
field $\phi$ was evaluated at $\Lambda^{-1}x$. The difference is that in Section 3.1 we
imagined that we transformed a pre-existing field distribution that was measured
by $\phi(x)$. Here, we are transforming the action of $\phi(x)$ in creating or
destroying particles. These two ways of implementing the Lorentz transformation
work in opposite directions. Notice, though, that the matrix acting on $\psi$ and
the transformation of the coordinate $x$ have the correct relative orientation,
consistent with Eq. (3.8).

Next we should discuss the spin of a Dirac particle. We expect Dirac
fermions to have spin 1/2; now we can demonstrate this property from our
formalism. We have already shown that the particles created by $a^{s\dagger}_{\mathbf{p}}$ and
$b^{s\dagger}_{\mathbf{p}}$ each come in two "spin" states: $s = 1, 2$. But we haven't proved yet that this
"spin" has anything to do with angular momentum. To do this, we must write
down the angular momentum operator.

Recall that we found the linear momentum operator in Section 2.2 by
looking for the conserved quantity associated with translational invariance.
We can find the angular momentum operator in a similar way as a consequence
of rotational invariance. Under a rotation (or any Lorentz transformation), the
Dirac field $\psi$ transforms (in our original convention) according to

$$\psi(x) \to \psi'(x) = \Lambda_{\frac12}\psi(\Lambda^{-1}x).$$

To apply Noether's theorem we must compute the change in the field at a
fixed point, that is,

$$\delta\psi = \psi'(x) - \psi(x) = \Lambda_{\frac12}\psi(\Lambda^{-1}x) - \psi(x).$$

Consider for definiteness an infinitesimal rotation of coordinates by an angle
$\theta$ about the $z$-axis. The parametrization of this transformation is given just
below Eq. (3.19): $\omega_{12} = -\omega_{21} = \theta$. Using the same parameters in Eq. (3.30),
we find

$$\Lambda_{\frac12} \approx 1 - \frac{i}{2}\omega_{\mu\nu}S^{\mu\nu} = 1 - \frac{i}{2}\theta\,\Sigma^3.$$

We can now compute

$$\begin{aligned}
\delta\psi(x) &= \left(1 - \frac{i}{2}\theta\,\Sigma^3\right)\psi(t, x + \theta y, y - \theta x, z) - \psi(x) \\
&= -\theta\left(x\partial_y - y\partial_x + \frac{i}{2}\Sigma^3\right)\psi(x) \equiv \theta\,\Delta\psi.
\end{aligned}$$

The time-component of the conserved Noether current is then

$$j^0 = \frac{\partial\mathcal{L}}{\partial(\partial_0\psi)}\Delta\psi = -i\bar\psi\gamma^0\left(x\partial_y - y\partial_x + \frac{i}{2}\Sigma^3\right)\psi.$$

Similar expressions hold for rotations about the $x$- and $y$-axes, so the angular
momentum operator is

$$\mathbf{J} = \int d^3x\,\psi^\dagger\left(\mathbf{x}\times(-i\nabla) + \frac12\boldsymbol\Sigma\right)\psi. \tag{3.111}$$


For nonrelativistic fermions, the first term of (3.111) gives the orbital angular
momentum. The second term therefore gives the spin angular momentum.
Unfortunately, the division of (3.111) into spin and orbital parts is not so
straightforward for relativistic fermions, so it is not simple to write a general
expression for this quantity in terms of ladder operators.

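To prove that a Dirac particle has spin 1/2, however, it suffices to consider
particles at rest. In that case, the orbital term of (3.111) does not contribute,
and we can easily write the spin term in terms of ladder operators. It is easiest
to use the Schrödinger picture expression (3.87) for $\psi(\mathbf{x})$:

$$\begin{aligned}
J_z = \int d^3x&\int\frac{d^3p\,d^3p'}{(2\pi)^6}\frac{1}{\sqrt{2E_{\mathbf{p}}\,2E_{\mathbf{p}'}}}\,e^{-i\mathbf{p}'\cdot\mathbf{x}}e^{i\mathbf{p}\cdot\mathbf{x}} \\
&\times\sum_{r,r'}\left(a^{r'\dagger}_{\mathbf{p}'}u^{r'\dagger}(\mathbf{p}') + b^{r'}_{-\mathbf{p}'}v^{r'\dagger}(-\mathbf{p}')\right)\frac{\Sigma^3}{2}\left(a^r_{\mathbf{p}}u^r(\mathbf{p}) + b^{r\dagger}_{-\mathbf{p}}v^r(-\mathbf{p})\right).
\end{aligned}$$

We would like to apply this operator to the one-particle zero-momentum state
$a^{s\dagger}_0|0\rangle$. This is most easily done using a trick: Since $J_z$ must annihilate the
vacuum, $J_z a^{s\dagger}_0|0\rangle = [J_z, a^{s\dagger}_0]|0\rangle$. The only nonzero term in this latter quantity
has the structure $[a^{r\dagger}_{\mathbf{p}}a^r_{\mathbf{p}}, a^{s\dagger}_0] = (2\pi)^3\delta^{(3)}(\mathbf{p})\,a^{r\dagger}_0\delta^{rs}$; the other three terms in
the commutator either vanish or annihilate the vacuum. Thus we find

$$J_z a^{s\dagger}_0|0\rangle = \frac{1}{2m}\sum_r\left(u^{r\dagger}(0)\frac{\Sigma^3}{2}u^s(0)\right)a^{r\dagger}_0|0\rangle = \sum_r\left(\xi^{r\dagger}\frac{\sigma^3}{2}\xi^s\right)a^{r\dagger}_0|0\rangle,$$

where we have used the explicit form (3.47) of $u(0)$ to obtain the last expression.
The sum over $r$ is accomplished most easily by choosing the spinors $\xi^r$
to be eigenstates of $\sigma^3$. We then find that for $\xi^s = \binom{1}{0}$, the one-particle state
is an eigenstate of $J_z$ with eigenvalue $+1/2$, while for $\xi^s = \binom{0}{1}$, it is an eigenstate
of $J_z$ with eigenvalue $-1/2$. This result is exactly what we expect for
electrons.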
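The last step can be checked with explicit rest-frame spinors. This is a minimal sketch, assuming the chiral-representation form $u^s(0) = \sqrt{m}\,(\xi^s, \xi^s)$ and $\Sigma^3 = \mathrm{diag}(\sigma^3, \sigma^3)$; the mass value is an arbitrary illustrative number.

```python
import numpy as np

sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)
Sigma3 = np.kron(np.eye(2), sigma3)       # block-diagonal (sigma3, sigma3)

m = 0.511                                  # arbitrary illustrative mass
xi = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
u0 = [np.sqrt(m) * np.concatenate([x, x]) for x in xi]   # u^s at rest

# Matrix elements (1/2m) u^{r dagger}(0) (Sigma^3 / 2) u^s(0), indexed by [r][s].
Jz = np.array(
    [[u0[r].conj() @ (Sigma3 / 2) @ u0[s] / (2 * m) for s in range(2)]
     for r in range(2)]
)
print(np.allclose(Jz, np.diag([0.5, -0.5])))  # True
```

The matrix is diagonal with entries $\pm 1/2$, which is the statement that $\xi^s = \binom{1}{0}$ and $\xi^s = \binom{0}{1}$ label $J_z$ eigenstates of the particle at rest.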

An analogous calculation determines the spin of a zero-momentum
antifermion. But in this case, since the order of the $b$ and $b^\dagger$ terms in $J_z$ is
reversed, we get an extra minus sign from evaluating $[b_{\mathbf{p}}b^\dagger_{\mathbf{p}}, b^\dagger_0] = -[b^\dagger_{\mathbf{p}}b_{\mathbf{p}}, b^\dagger_0]$.
Thus for positrons, the association between the spinors $\eta^s$ and the spin angular
momentum is reversed: $\binom{1}{0}$ corresponds to spin $-1/2$, while $\binom{0}{1}$ corresponds
to spin $+1/2$. This reversal of sign agrees with the prediction of Dirac hole
theory. From that viewpoint, a positron is the absence of a negative-energy
electron. If the missing electron had positive $J_z$, its absence has negative $J_z$.

In summary, the angular momentum of zero-momentum fermions is given by

$$J_z\,a^{s\dagger}_0|0\rangle = \pm\frac12\,a^{s\dagger}_0|0\rangle, \qquad J_z\,b^{s\dagger}_0|0\rangle = \mp\frac12\,b^{s\dagger}_0|0\rangle, \tag{3.112}$$

where the upper sign is for $\xi^s = \binom{1}{0}$ and the lower sign is for $\xi^s = \binom{0}{1}$.

There is one more important conserved quantity in the Dirac theory. In
Section 3.4 we saw that the current $j^\mu = \bar\psi\gamma^\mu\psi$ is conserved. The charge
associated with this current is

$$Q = \int d^3x\,\psi^\dagger(x)\psi(x) = \int\frac{d^3p}{(2\pi)^3}\sum_s\left(a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} + b^s_{-\mathbf{p}}b^{s\dagger}_{-\mathbf{p}}\right),$$

or, if we ignore another infinite constant,

$$Q = \int\frac{d^3p}{(2\pi)^3}\sum_s\left(a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} - b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\right). \tag{3.113}$$

So $a^{s\dagger}_{\mathbf{p}}$ creates fermions with charge $+1$, while $b^{s\dagger}_{\mathbf{p}}$ creates antifermions with
charge $-1$. When we couple the Dirac field to the electromagnetic field, we
will see that $Q$ is none other than the electric charge (up to a constant factor
that depends on which type of particle we wish to describe; e.g., for electrons,
the electric charge is $Qe$).

In Quantum Electrodynamics we will use the spinor field $\psi$ to describe
electrons and positrons. The particles created by $a^{s\dagger}_{\mathbf{p}}$ are electrons; they have
energy $E_{\mathbf{p}}$, momentum $\mathbf{p}$, spin 1/2 with polarization appropriate to $\xi^s$, and
charge $+1$ (in units of $e$). The particles created by $b^{s\dagger}_{\mathbf{p}}$ are positrons; they have
energy $E_{\mathbf{p}}$, momentum $\mathbf{p}$, spin 1/2 with polarization opposite to that of $\xi^s$,
and charge $-1$. The state $\psi_\alpha(x)|0\rangle$ contains a positron at position $x$, whose
polarization corresponds to the spinor component chosen. Similarly, $\bar\psi_\alpha(x)|0\rangle$
is a state of one electron at position $x$.

The Dirac Propagator 

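Calculating propagation amplitudes for the Dirac field is by now a
straightforward exercise:

$$\begin{aligned}
\langle 0|\,\psi_a(x)\bar\psi_b(y)\,|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\sum_s u^s_a(p)\bar u^s_b(p)\,e^{-ip\cdot(x-y)} \\
&= (i\partial\!\!\!/_x + m)_{ab}\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,e^{-ip\cdot(x-y)},
\end{aligned} \tag{3.114}$$

$$\begin{aligned}
\langle 0|\,\bar\psi_b(y)\psi_a(x)\,|0\rangle &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\sum_s v^s_a(p)\bar v^s_b(p)\,e^{-ip\cdot(y-x)} \\
&= -(i\partial\!\!\!/_x + m)_{ab}\int\frac{d^3p}{(2\pi)^3}\frac{1}{2E_{\mathbf{p}}}\,e^{-ip\cdot(y-x)}.
\end{aligned} \tag{3.115}$$

Just as we did for the Klein-Gordon equation, we can construct Green's
functions for the Dirac equation obeying various boundary conditions. For
example, the retarded Green's function is

$$S_R^{ab}(x - y) \equiv \theta(x^0 - y^0)\,\langle 0|\,\{\psi_a(x), \bar\psi_b(y)\}\,|0\rangle. \tag{3.116}$$

It is easy to verify that

$$S_R(x - y) = (i\partial\!\!\!/_x + m)\,D_R(x - y), \tag{3.117}$$

since on the right-hand side the term involving $\partial_0\theta(x^0 - y^0)$ vanishes. Using
(3.117) and the fact that $\partial\!\!\!/\,\partial\!\!\!/ = \partial^2$, we see that $S_R$ is a Green's function of
the Dirac operator:

$$(i\partial\!\!\!/_x - m)\,S_R(x - y) = i\delta^{(4)}(x - y)\cdot\mathbf{1}_{4\times 4}. \tag{3.118}$$

The Green's function of the Dirac operator can also be found by Fourier
transformation. Expanding $S_R(x - y)$ as a Fourier integral and acting on both
sides with $(i\partial\!\!\!/_x - m)$, we find

$$i\delta^{(4)}(x - y) = \int\frac{d^4p}{(2\pi)^4}\,(p\!\!\!/ - m)\,e^{-ip\cdot(x-y)}\,\tilde S_R(p), \tag{3.119}$$

and hence

$$\tilde S_R(p) = \frac{i}{p\!\!\!/ - m} = \frac{i(p\!\!\!/ + m)}{p^2 - m^2}. \tag{3.120}$$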
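The two forms of the momentum-space Green's function in (3.120) agree because $(p\!\!\!/ - m)(p\!\!\!/ + m) = (p^2 - m^2)\cdot\mathbf{1}$. A quick numerical sketch, assuming the chiral representation and an arbitrary off-shell four-momentum chosen for illustration:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gammas = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in pauli
]
metric = np.diag([1.0, -1.0, -1.0, -1.0])   # mostly-minus signature

p = np.array([1.7, 0.3, -1.2, 0.5])   # arbitrary off-shell p^mu
m = 0.8
p_lower = metric @ p                   # p_mu
pslash = sum(p_lower[mu] * gammas[mu] for mu in range(4))
p2 = p @ p_lower                       # p^2 = p^mu p_mu

lhs = 1j * np.linalg.inv(pslash - m * np.eye(4))      # i / (pslash - m)
rhs = 1j * (pslash + m * np.eye(4)) / (p2 - m**2)     # i (pslash + m) / (p^2 - m^2)
print(np.allclose(lhs, rhs))  # True
```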


To obtain the retarded Green's function, we must evaluate the $p^0$ integral in
(3.120) along the contour shown on page 30. For $x^0 > y^0$ we close the contour
below, picking up both poles to obtain the sum of (3.114) and (3.115). For
$x^0 < y^0$ we close the contour above and get zero.

The Green's function with Feynman boundary conditions is defined by
the contour shown on page 31:

$$\begin{aligned}
S_F(x - y) &= \int\frac{d^4p}{(2\pi)^4}\frac{i(p\!\!\!/ + m)}{p^2 - m^2 + i\epsilon}\,e^{-ip\cdot(x-y)} \\
&= \begin{cases}
\langle 0|\,\psi(x)\bar\psi(y)\,|0\rangle & \text{for } x^0 > y^0 \quad\text{(close contour below)} \\
-\langle 0|\,\bar\psi(y)\psi(x)\,|0\rangle & \text{for } x^0 < y^0 \quad\text{(close contour above)}
\end{cases} \\
&\equiv \langle 0|\,T\psi(x)\bar\psi(y)\,|0\rangle,
\end{aligned} \tag{3.121}$$


where we have chosen to define the time-ordered product of spinor fields with 
an additional minus sign when the operators are interchanged. This minus 
sign is extremely important in the quantum field theory of fermions; we will 
meet it again in Section 4.7. 

As with the Klein-Gordon theory, the expression (3.121) for the Feynman
propagator is the most useful result of this chapter. When we do perturbative
calculations with Feynman diagrams, we will associate the factor $\tilde S_F(p)$ with
each internal fermion line.

3.6 Discrete Symmetries of the Dirac Theory

In the last section we discussed the implementation of continuous Lorentz
transformations on the Hilbert space of the Dirac theory. We found that for
each transformation $\Lambda$ there was a unitary operator $U(\Lambda)$, which induced the
correct transformation on the fields:

$$U(\Lambda)\psi(x)U^{-1}(\Lambda) = \Lambda^{-1}_{\frac12}\psi(\Lambda x). \tag{3.122}$$

In this section we will discuss the analogous operators that implement various
discrete symmetries on the Dirac field.

In addition to continuous Lorentz transformations, there are two other
spacetime operations that are potential symmetries of the Lagrangian: parity
and time reversal. Parity, denoted by $P$, sends $(t, \mathbf{x}) \to (t, -\mathbf{x})$, reversing
the handedness of space. Time reversal, denoted by $T$, sends $(t, \mathbf{x}) \to (-t, \mathbf{x})$,
interchanging the forward and backward light-cones. Neither of these operations
can be achieved by a continuous Lorentz transformation starting from
the identity. Both, however, preserve the Minkowski interval $x^2 = t^2 - |\mathbf{x}|^2$. In
standard terminology, the continuous Lorentz transformations are referred to
as the proper, orthochronous Lorentz group, $L_+^\uparrow$. Then the full Lorentz group
breaks up into four disconnected subsets, as shown below.


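$$\begin{array}{lcc}
 & \text{proper} & \text{improper} \\
\text{orthochronous} & L_+^\uparrow & L_-^\uparrow = P L_+^\uparrow \\
\text{nonorthochronous} & L_+^\downarrow = T L_+^\uparrow & L_-^\downarrow = PT L_+^\uparrow
\end{array}$$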

At the same time that we discuss P and T, it will be convenient to discuss a 
third (non-spacetime) discrete operation: charge conjugation, denoted by C. 
Under this operation, particles and antiparticles are interchanged. 

Although any relativistic field theory must be invariant under $L_+^\uparrow$, it need
not be invariant under $P$, $T$, or $C$. What is the status of these symmetry
operations in the real world? From experiment, we know that three of the forces
of Nature (the gravitational, electromagnetic, and strong interactions) are
symmetric with respect to $P$, $C$, and $T$. The weak interactions violate $C$ and
$P$ separately, but preserve $CP$ and $T$. But certain rare processes (all so far
observed involve neutral $K$ mesons) also show $CP$ and $T$ violation. All
observations indicate that the combination $CPT$ is a perfect symmetry of Nature.

The currently accepted theoretical model of the weak interactions is the 
Glashow-Weinberg-Salam gauge theory, described in Chapter 20. This theory 
violates C and P in the strongest possible way. It is actually a surprise (though 
not quite an accident) that C and P happen to be quite good symmetries in the 
most readily observable processes. On the other hand, no one knows a really 
beautiful theory that violates CP. In the current theory, when there are three
(or more) fermion generations, there is room for a parameter that, if nonzero,
causes CP violation. But the value of this parameter is no better understood
than the value of the electron mass; the physical origin of CP violation remains
a mystery. We will discuss this question further in Section 20.3.

Parity 

With this introduction, let us now discuss the action of P, T, and C on Dirac 
particles and fields. First consider parity. The operator P should reverse the 
momentum of a particle without flipping its spin: 


Mathematically, this means that $P$ should be implemented by a unitary
operator (properly called $U(P)$, but we'll just call it $P$) which, for example,
transforms the state $a^{s\dagger}_{\mathbf{p}}|0\rangle$ into $a^{s\dagger}_{-\mathbf{p}}|0\rangle$. In other words, we want

$$Pa^s_{\mathbf{p}}P = \eta_a a^s_{-\mathbf{p}} \qquad \text{and} \qquad Pb^s_{\mathbf{p}}P = \eta_b b^s_{-\mathbf{p}}, \tag{3.123}$$

where $\eta_a$ and $\eta_b$ are possible phases. These phases are restricted by the
condition that two applications of the parity operator should return observables
to their original values. Since observables are built from an even number of
fermion operators, this requires $\eta_a^2, \eta_b^2 = \pm 1$.

Just as a continuous Lorentz transformation is implemented on the Dirac
field as the $4\times 4$ constant matrix $\Lambda_{\frac12}$, the parity transformation should also be
represented by a $4\times 4$ constant matrix. To find this matrix, and to determine
$\eta_a$ and $\eta_b$, we compute the action of $P$ on $\psi(x)$. Using (3.123), we have

$$P\psi(x)P = \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(\eta_a a^s_{-\mathbf{p}}u^s(p)e^{-ipx} + \eta_b^* b^{s\dagger}_{-\mathbf{p}}v^s(p)e^{ipx}\right). \tag{3.124}$$


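Now change variables to $\tilde p = (p^0, -\mathbf{p})$. Note that $p\cdot x = \tilde p\cdot(t, -\mathbf{x})$. Also
$\tilde p\cdot\sigma = p\cdot\bar\sigma$ and $\tilde p\cdot\bar\sigma = p\cdot\sigma$. This allows us to write

$$\begin{aligned}
u(p) &= \begin{pmatrix}\sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi\end{pmatrix} = \begin{pmatrix}\sqrt{\tilde p\cdot\bar\sigma}\,\xi \\ \sqrt{\tilde p\cdot\sigma}\,\xi\end{pmatrix} = \gamma^0u(\tilde p); \\
v(p) &= \begin{pmatrix}\sqrt{p\cdot\sigma}\,\xi \\ -\sqrt{p\cdot\bar\sigma}\,\xi\end{pmatrix} = \begin{pmatrix}\sqrt{\tilde p\cdot\bar\sigma}\,\xi \\ -\sqrt{\tilde p\cdot\sigma}\,\xi\end{pmatrix} = -\gamma^0v(\tilde p).
\end{aligned}$$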
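These two spinor relations can be confirmed numerically. The sketch assumes the chiral-representation spinors written above, with the same two-component spinor $\xi$ used for $u$ and $v$; the helper names `sqrt_herm` and `u_and_v` and the numerical values are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma0 = np.block([[Z2, I2], [I2, Z2]])


def sqrt_herm(M):
    """Square root of a positive Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.conj().T


def u_and_v(p3, m, xi):
    """u(p) = (sqrt(p.sigma) xi, sqrt(p.sigmabar) xi); v(p) has a minus sign below."""
    E = np.sqrt(m**2 + p3 @ p3)
    p_sigma = E * I2 - sum(pi * s for pi, s in zip(p3, pauli))
    p_sigmabar = E * I2 + sum(pi * s for pi, s in zip(p3, pauli))
    top, bottom = sqrt_herm(p_sigma) @ xi, sqrt_herm(p_sigmabar) @ xi
    return np.concatenate([top, bottom]), np.concatenate([top, -bottom])


p3, m = np.array([0.3, -1.2, 0.5]), 0.8          # arbitrary illustrative values
xi = np.array([0.6, 0.8j], dtype=complex)        # arbitrary normalized 2-spinor

u_p, v_p = u_and_v(p3, m, xi)
u_pt, v_pt = u_and_v(-p3, m, xi)                 # spinors at p-tilde = (p0, -p)

print(np.allclose(u_p, gamma0 @ u_pt), np.allclose(v_p, -gamma0 @ v_pt))  # True True
```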


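So (3.124) becomes

$$P\psi(x)P = \int\frac{d^3\tilde p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\tilde{\mathbf{p}}}}}\sum_s\left(\eta_a a^s_{\tilde{\mathbf{p}}}\gamma^0u^s(\tilde p)e^{-i\tilde p\cdot(t,-\mathbf{x})} - \eta_b^* b^{s\dagger}_{\tilde{\mathbf{p}}}\gamma^0v^s(\tilde p)e^{i\tilde p\cdot(t,-\mathbf{x})}\right).$$

This should equal some constant matrix times $\psi(t, -\mathbf{x})$, and indeed it works
if we make $\eta_b^* = -\eta_a$. This implies

$$\eta_a\eta_b = -\eta_a\eta_a^* = -1. \tag{3.125}$$

Thus we have the parity transformation of $\psi(x)$ in its final form,

$$P\psi(t, \mathbf{x})P = \eta_a\gamma^0\psi(t, -\mathbf{x}). \tag{3.126}$$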

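It will be very important (for example, in writing down Lagrangians) to
know how the various Dirac field bilinears transform under parity. Recall that
the five bilinears are

$$\bar\psi\psi, \quad \bar\psi\gamma^\mu\psi, \quad i\bar\psi[\gamma^\mu, \gamma^\nu]\psi, \quad \bar\psi\gamma^\mu\gamma^5\psi, \quad i\bar\psi\gamma^5\psi. \tag{3.127}$$

The factors of $i$ have been chosen to make all these quantities Hermitian, as
you can easily verify. (Any new term that we add to a Lagrangian must be
real.) First we should compute

$$P\bar\psi(t, \mathbf{x})P = P\psi^\dagger(t, \mathbf{x})P\gamma^0 = \left(P\psi(t, \mathbf{x})P\right)^\dagger\gamma^0 = \eta_a^*\bar\psi(t, -\mathbf{x})\gamma^0. \tag{3.128}$$

Then the scalar bilinear transforms as

$$P\bar\psi\psi P = |\eta_a|^2\,\bar\psi(t, -\mathbf{x})\gamma^0\gamma^0\psi(t, -\mathbf{x}) = +\bar\psi\psi(t, -\mathbf{x}), \tag{3.129}$$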

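while for the vector we obtain

$$P\bar\psi\gamma^\mu\psi P = \bar\psi\gamma^0\gamma^\mu\gamma^0\psi(t, -\mathbf{x}) = \begin{cases}
+\bar\psi\gamma^\mu\psi(t, -\mathbf{x}) & \text{for } \mu = 0, \\
-\bar\psi\gamma^\mu\psi(t, -\mathbf{x}) & \text{for } \mu = 1, 2, 3.
\end{cases} \tag{3.130}$$

Note that the vector acquires the same minus sign on the spatial components
as does the vector $x^\mu$. Similarly, the transformations of the pseudo-scalar and
pseudo-vector are

$$Pi\bar\psi\gamma^5\psi P = i\bar\psi\gamma^0\gamma^5\gamma^0\psi(t, -\mathbf{x}) = -i\bar\psi\gamma^5\psi(t, -\mathbf{x}); \tag{3.131}$$

$$P\bar\psi\gamma^\mu\gamma^5\psi P = \bar\psi\gamma^0\gamma^\mu\gamma^5\gamma^0\psi(t, -\mathbf{x}) = \begin{cases}
-\bar\psi\gamma^\mu\gamma^5\psi(t, -\mathbf{x}) & \text{for } \mu = 0, \\
+\bar\psi\gamma^\mu\gamma^5\psi(t, -\mathbf{x}) & \text{for } \mu = 1, 2, 3.
\end{cases} \tag{3.132}$$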
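All of the signs in (3.129) through (3.132) come from conjugating the matrix in each bilinear with $\gamma^0$. The sketch below, assuming the chiral representation, tabulates the sign $s$ in $\gamma^0\Gamma\gamma^0 = s\,\Gamma$ for each case; the helper name `parity_sign` is illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gammas = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in pauli
]
gamma5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]


def parity_sign(Gamma):
    """Return s = +1 or -1 such that gamma0 Gamma gamma0 = s * Gamma."""
    conjugated = gammas[0] @ Gamma @ gammas[0]
    if np.allclose(conjugated, Gamma):
        return 1
    if np.allclose(conjugated, -Gamma):
        return -1
    raise ValueError("matrix has no definite sign under gamma0 conjugation")


signs = {
    "scalar": parity_sign(np.eye(4)),
    "vector": [parity_sign(g) for g in gammas],
    "pseudoscalar": parity_sign(gamma5),
    "pseudovector": [parity_sign(g @ gamma5) for g in gammas],
}
print(signs)
```

The vector entries come out as $(+,-,-,-)$ and the pseudo-vector entries as $(-,+,+,+)$, with the scalar even and the pseudo-scalar odd.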


Just as we anticipated in Section 3.4, the “pseudo” signifies an extra mi¬ 
nus sign in the parity transformation. (The transformation properties of 
= ^xpa^xp are reserved for Problem 3.7.) Note that the transfor¬ 
mation properties of fermion bilinears were independent of r ) a , so there would 
have been no loss of generality in setting r] a = —rfo = 1 from the beginning. 

However, the relative minus sign (3.125) between the parity transformations
of a fermion and an antifermion has important consequences. Consider
a fermion-antifermion state, $a^{s\dagger}_{\mathbf{p}} b^{s'\dagger}_{\mathbf{q}}|0\rangle$. Applying P, we find
$P\big(a^{s\dagger}_{\mathbf{p}} b^{s'\dagger}_{\mathbf{q}}|0\rangle\big) = -\big(a^{s\dagger}_{-\mathbf{p}} b^{s'\dagger}_{-\mathbf{q}}|0\rangle\big)$. Thus a state containing a
fermion-antifermion pair gets an extra $(-1)$ under parity. This information is
most useful in the context of bound states, in which the fermion and antifermion
momenta are integrated with the Schrödinger wavefunction to produce a system
localized in space. We consider such states in detail in Section 5.3, but here we
should remark that if the spatial wavefunction is symmetric under $\mathbf{x} \to -\mathbf{x}$, the
state has odd parity, while if it is antisymmetric under $\mathbf{x} \to -\mathbf{x}$, the state has
even parity. The L = 0 bound states, for example, have odd parity; the J = 0
state transforms as a pseudo-scalar, while the three J = 1 states transform as
the spatial components of a vector. These properties show up in selection rules
for decays of positronium and quark-antiquark systems (see Problem 3.8).
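The parity assignments quoted here can be collected into a single formula. As a sketch, assume the bound-state wavefunction has definite orbital angular momentum L, so that it picks up a factor $(-1)^L$ under $\mathbf{x} \to -\mathbf{x}$ (a property of spherical harmonics that is not derived in the text); the label $P_{f\bar f}$ is introduced here only for illustration.

```latex
P_{f\bar f}
  \;=\; \underbrace{(-1)}_{\text{fermion-antifermion pair}}
  \times \underbrace{(-1)^{L}}_{\text{spatial wavefunction}}
  \;=\; (-1)^{L+1},
\qquad
L = 0:\; P_{f\bar f} = -1,
\qquad
L = 1:\; P_{f\bar f} = +1 .
```

For L = 0 this reproduces the odd parity stated for the ground states.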

Time Reversal 

Now let us turn to the implementation of time reversal. We would like T to
take the form of a unitary operator that sends $a_{\mathbf{p}}$ to $a_{-\mathbf{p}}$ (and similarly for
$b_{\mathbf{p}}$) and $\psi(t,\mathbf{x})$ to $\psi(-t,\mathbf{x})$ (times some constant matrix). These properties,
however, are extremely difficult to achieve, since we saw above that sending
$a_{\mathbf{p}}$ to $a_{-\mathbf{p}}$ instead sends $(t,\mathbf{x})$ to $(t,-\mathbf{x})$ in the expansion of $\psi$. The difficulty is
even more apparent when we impose the constraint that time reversal should
be a symmetry of the free Dirac theory, [T, H] = 0. Then

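$$\begin{aligned}
\psi(t,\mathbf{x}) &= e^{iHt}\,\psi(\mathbf{x})\,e^{-iHt} \\
\Rightarrow\quad T\psi(t,\mathbf{x})T &= e^{iHt}\,[T\psi(\mathbf{x})T]\,e^{-iHt} \\
\Rightarrow\quad T\psi(t,\mathbf{x})T\,|0\rangle &= e^{iHt}\,[T\psi(\mathbf{x})T]\,|0\rangle,
\end{aligned}$$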

assuming that $H|0\rangle = 0$. The right-hand side is a sum of negative-frequency
terms only. But if T is to reverse the time dependence of $\psi(t,\mathbf{x})$, then the
left-hand side is (up to a constant matrix) $\psi(-t,\mathbf{x})|0\rangle = e^{-iHt}\psi(\mathbf{x})|0\rangle$, which is
a sum of positive-frequency terms. Thus we have proved that T cannot be
implemented as a linear unitary operator.

What can we do? The way out is to retain the unitarity condition
$T^\dagger = T^{-1}$, but have T act on c-numbers as well as operators, as follows:

$$T(\text{c-number}) = (\text{c-number})^*\,T. \tag{3.133}$$

Then even if [T, H] = 0, the time dependence of all exponential factors is
reversed: $Te^{+iHt} = e^{-iHt}T$. Since all time evolution in quantum mechanics is
performed with such exponential factors, this effectively changes the sign of t.
Note that the operation of complex conjugation is nonlinear; T is referred to
as an antilinear or antiunitary operator.

In addition to reversing the momentum of a particle, T should also flip
the spin. To quantify this, we must find a mathematical operation that flips a
spinor $\xi$.

In the earlier parts of this chapter, we denoted the spin state of a fermion
by a label s = 1,2. In the remainder of this section, we will associate s with
the physical spin component of the fermion along a specific axis. If this axis
has polar coordinates $\theta$, $\phi$, the two-component spinors with spin up and spin
down along this axis are

$$\xi(\uparrow) = \begin{pmatrix} \cos\frac{\theta}{2} \\ e^{i\phi}\sin\frac{\theta}{2} \end{pmatrix}, \qquad \xi(\downarrow) = \begin{pmatrix} -e^{-i\phi}\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2} \end{pmatrix}.$$

Let $\xi^s = (\xi(\uparrow), \xi(\downarrow))$ for s = 1,2. Also define

$$\xi^{-s} = -i\sigma^2(\xi^s)^*. \tag{3.134}$$


This quantity is the flipped spinor; from the explicit formulae,

$$\xi^{-s} = (\xi(\downarrow), -\xi(\uparrow)). \tag{3.135}$$


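The form of the spin reversal relation follows more generally from the identity
$\boldsymbol{\sigma}\sigma^2 = \sigma^2(-\boldsymbol{\sigma}^*)$. This equation implies that, if $\xi$ satisfies $\mathbf{n}\cdot\boldsymbol{\sigma}\,\xi = +\xi$ for some
axis $\mathbf{n}$, then

$$(\mathbf{n}\cdot\boldsymbol{\sigma})(-i\sigma^2\xi^*) = -i\sigma^2(-\mathbf{n}\cdot\boldsymbol{\sigma})^*\xi^* = i\sigma^2(\xi^*) = -(-i\sigma^2\xi^*).$$

Notice that, with this convention for the spin flip, two successive spin flips
return a spin to $(-1)$ times the original state.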
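The spin-flip rule (3.134) and the factor of $(-1)$ after two flips are easy to confirm numerically. This is a small sketch in plain Python; the function names and the sample angles used in the usage note are illustrative choices, not taken from the text.

```python
import cmath
import math

def xi_up(theta, phi):
    """Spin-up spinor along the axis with polar angles (theta, phi)."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]

def xi_down(theta, phi):
    """Spin-down spinor along the same axis."""
    return [-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2)]

def flip(xi):
    """Spin flip of Eq. (3.134): xi -> -i sigma^2 xi^*.

    Since -i sigma^2 = [[0, -1], [1, 0]], this maps (a, b) to (-b*, a*).
    """
    a, b = (complex(x).conjugate() for x in xi)
    return [-b, a]

def spin_along(xi, theta, phi):
    """Expectation value of n.sigma in the normalized state xi."""
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    a, b = (complex(x) for x in xi)
    # n.sigma = [[nz, nx - i ny], [nx + i ny, -nz]]
    top = nz * a + (nx - 1j * ny) * b
    bottom = (nx + 1j * ny) * a - nz * b
    return (a.conjugate() * top + b.conjugate() * bottom).real

def close(u, v, tol=1e-12):
    """Componentwise comparison of two spinors."""
    return all(abs(x - y) < tol for x, y in zip(u, v))
```

For example, `close(flip(xi_up(0.7, 1.3)), xi_down(0.7, 1.3))` compares the flipped spin-up spinor with the spin-down spinor of (3.135), and `flip(flip(xi))` can be compared with `[-x for x in xi]`.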

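We now associate the various fermion spin states with these spinors. The
electron annihilation operator $a^s_{\mathbf{p}}$ destroys an electron whose spinor $u^s(p)$
contains $\xi^s$. The positron annihilation operator $b^s_{\mathbf{p}}$ destroys a positron whose
spinor $v^s(p)$ contains $\xi^{-s}$:

$$v^s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^{-s} \\ -\sqrt{p\cdot\bar\sigma}\,\xi^{-s} \end{pmatrix}. \tag{3.136}$$

As in Eq. (3.135), we define

$$a^{-s}_{\mathbf{p}} = (a^2_{\mathbf{p}}, -a^1_{\mathbf{p}}), \qquad b^{-s}_{\mathbf{p}} = (b^2_{\mathbf{p}}, -b^1_{\mathbf{p}}). \tag{3.137}$$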

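We can now work out the relation between the Dirac spinors u and v and
their time reversals. Define $\tilde p = (p^0, -\mathbf{p})$. This vector satisfies the identity
$\sqrt{\tilde p\cdot\sigma}\,\sigma^2 = \sigma^2\sqrt{p\cdot\sigma^*}$; to prove this, expand the square root as in (3.49). For
some choice of spin and momentum, associated with the Dirac spinor $u^s(p)$,
let $u^{-s}(\tilde p)$ be the spinor with the reversed momentum and flipped spin. These
quantities are related by

$$u^{-s}(\tilde p) = \begin{pmatrix} \sqrt{\tilde p\cdot\sigma}\,(-i\sigma^2\xi^{s*}) \\ \sqrt{\tilde p\cdot\bar\sigma}\,(-i\sigma^2\xi^{s*}) \end{pmatrix} = \begin{pmatrix} -i\sigma^2\sqrt{p\cdot\sigma^*}\,\xi^{s*} \\ -i\sigma^2\sqrt{p\cdot\bar\sigma^*}\,\xi^{s*} \end{pmatrix} = -i\begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix}[u^s(p)]^* = -\gamma^1\gamma^3[u^s(p)]^*.$$

Similarly, for $v^s(p)$,

$$v^{-s}(\tilde p) = -\gamma^1\gamma^3[v^s(p)]^*;$$

in this relation, $v^{-s}$ contains $\xi^{-(-s)} = -\xi^s$.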
Using the notation of Eq. (3.137), we define the time reversal transformation
of fermion annihilation operators as follows:

$$Ta^s_{\mathbf{p}}T = a^{-s}_{-\mathbf{p}}, \qquad Tb^s_{\mathbf{p}}T = b^{-s}_{-\mathbf{p}}. \tag{3.138}$$

(An additional overall phase would have no effect on the rest of our discussion
and is omitted for simplicity.) Relations (3.138) allow us to compute the action
of T on the fermion field $\psi(x)$:

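$$\begin{aligned}
T\psi(t,\mathbf{x})T &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s T\left(a^s_{\mathbf{p}}u^s(p)e^{-ipx} + b^{s\dagger}_{\mathbf{p}}v^s(p)e^{ipx}\right)T \\
&= \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s \left(a^{-s}_{-\mathbf{p}}[u^s(p)]^*e^{ipx} + b^{-s\dagger}_{-\mathbf{p}}[v^s(p)]^*e^{-ipx}\right) \\
&= (\gamma^1\gamma^3)\int\frac{d^3\tilde p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\tilde{\mathbf{p}}}}}\sum_s \left(a^{-s}_{\tilde{\mathbf{p}}}u^{-s}(\tilde p)e^{i\tilde p\cdot(t,-\mathbf{x})} + b^{-s\dagger}_{\tilde{\mathbf{p}}}v^{-s}(\tilde p)e^{-i\tilde p\cdot(t,-\mathbf{x})}\right) \\
&= (\gamma^1\gamma^3)\,\psi(-t,\mathbf{x}).
\end{aligned} \tag{3.139}$$

The prefactor follows from inverting $u^{-s}(\tilde p) = -\gamma^1\gamma^3[u^s(p)]^*$: since
$(\gamma^1\gamma^3)^2 = -1$, we have $[u^s(p)]^* = \gamma^1\gamma^3 u^{-s}(\tilde p)$, and likewise for $v$.
In the last step we used $\tilde p\cdot(t,-\mathbf{x}) = -\tilde p\cdot(-t,\mathbf{x})$. Just as for parity, we have
derived a simple transformation law for the fermion field $\psi(x)$. The relative
minus sign in the transformation laws for particle and antiparticle is present
here as well, implicit in the twice-flipped spinor in $v^{-s}$.

Now we can check the action of T on the various bilinears. First we need

$$T\bar\psi T = (T\psi T)^\dagger(\gamma^0)^* = \psi^\dagger(-t,\mathbf{x})\,[\gamma^1\gamma^3]^\dagger\gamma^0 = \bar\psi(-t,\mathbf{x})\,[-\gamma^1\gamma^3]. \tag{3.140}$$

Then the transformation of the scalar bilinear is

$$T\bar\psi\psi(t,\mathbf{x})T = \bar\psi(-\gamma^1\gamma^3)(\gamma^1\gamma^3)\psi(-t,\mathbf{x}) = +\bar\psi\psi(-t,\mathbf{x}). \tag{3.141}$$

The pseudo-scalar acquires an extra minus sign when T goes through the i:

$$T\,i\bar\psi\gamma^5\psi\,T = -i\bar\psi(-\gamma^1\gamma^3)\gamma^5(\gamma^1\gamma^3)\psi = -i\bar\psi\gamma^5\psi(-t,\mathbf{x}).$$

For the vector, we must separately compute each of the four cases $\mu = 0,1,2,3$.
After a bit of work you should find

$$T\bar\psi\gamma^\mu\psi T = \bar\psi(-\gamma^1\gamma^3)(\gamma^\mu)^*(\gamma^1\gamma^3)\psi = \begin{cases} +\bar\psi\gamma^\mu\psi(-t,\mathbf{x}) & \text{for } \mu = 0; \\ -\bar\psi\gamma^\mu\psi(-t,\mathbf{x}) & \text{for } \mu = 1,2,3. \end{cases} \tag{3.142}$$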

This is exactly the transformation property we want for vectors such as the
current density. You can verify that the pseudo-vector transforms in exactly
the same way under time reversal.
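The "bit of work" for time reversal reduces to one matrix identity per bilinear. Writing $T\psi T = M\psi$ and $T\bar\psi T = \bar\psi M^{-1}$ with $M \propto \gamma^1\gamma^3$, antilinearity gives $T\bar\psi\Gamma\psi T = \bar\psi\,M^{-1}\Gamma^*M\,\psi$, and the overall sign of $M$ drops out. The sketch below checks the resulting signs, assuming the chiral basis of this chapter; all helper names are illustrative.

```python
# Pauli matrices and 2x2 helpers (chiral basis, as in this chapter).
S = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]

def neg(m):
    """Return -m for a matrix given as nested lists."""
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(*ms):
    """Multiply any number of 4x4 matrices, left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def conj(m):
    """Entrywise complex conjugate (no transpose)."""
    return [[complex(x).conjugate() for x in row] for row in m]

def rel_sign(m, ref):
    """Return +1 if m == ref, -1 if m == -ref, None otherwise."""
    pairs = [(x, y) for rm, rr in zip(m, ref) for x, y in zip(rm, rr)]
    if all(abs(x - y) < 1e-12 for x, y in pairs):
        return 1
    if all(abs(x + y) < 1e-12 for x, y in pairs):
        return -1
    return None

G = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in S]
G5 = [[1j * x for x in row] for row in mul(G[0], G[1], G[2], G[3])]
ONE = block(I2, Z2, Z2, I2)
M = mul(G[1], G[3])   # gamma^1 gamma^3
M_INV = neg(M)        # valid because (gamma^1 gamma^3)^2 = -1

def time_reversal_signs():
    """Sign eta in M^-1 Gamma^* M = eta * Gamma for each bilinear."""
    t = lambda gam: rel_sign(mul(M_INV, conj(gam), M), gam)
    return {
        "scalar": t(ONE),
        "pseudoscalar": -t(G5),  # the explicit i contributes T i = -i T
        "vector": [t(G[mu]) for mu in range(4)],
        "pseudovector": [t(mul(G[mu], G5)) for mu in range(4)],
    }

print(time_reversal_signs())
```

The vector and pseudo-vector lists, ordered as $\mu = 0,1,2,3$, should agree with each other, which is the statement that the pseudo-vector transforms in the same way as the vector under T.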
Charge Conjugation

The last of the three discrete symmetries is the particle-antiparticle
symmetry C. There will be no problem in implementing C as a unitary linear
operator. Charge conjugation is conventionally defined to take a fermion with
a given spin orientation into an antifermion with the same spin orientation.
Thus, a convenient choice for the transformation of fermion annihilation
operators is

$$Ca^s_{\mathbf{p}}C = b^s_{\mathbf{p}}; \qquad Cb^s_{\mathbf{p}}C = a^s_{\mathbf{p}}. \tag{3.143}$$

Again, we ignore possible additional phases for simplicity. 

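Next we want to work out the action of C on $\psi(x)$. First we need a relation
between $v^s(p)$ and $u^s(p)$. Using (3.136) and (3.134),

$$(v^s(p))^* = \begin{pmatrix} \sqrt{p\cdot\sigma}\,(-i\sigma^2\xi^*) \\ -\sqrt{p\cdot\bar\sigma}\,(-i\sigma^2\xi^*) \end{pmatrix}^* = \begin{pmatrix} -i\sigma^2\sqrt{p\cdot\bar\sigma^*}\,\xi^* \\ i\sigma^2\sqrt{p\cdot\sigma^*}\,\xi^* \end{pmatrix}^* = \begin{pmatrix} 0 & -i\sigma^2 \\ i\sigma^2 & 0 \end{pmatrix}\begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix},$$

where $\xi$ stands for $\xi^s$. That is,

$$u^s(p) = -i\gamma^2(v^s(p))^*, \qquad v^s(p) = -i\gamma^2(u^s(p))^*. \tag{3.144}$$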

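If we substitute (3.144) into the expression for the fermion field operator, and
then transform this operator with C, we find

$$\begin{aligned}
C\psi(x)C &= \int\frac{d^3p}{(2\pi)^3}\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left(-i\gamma^2 b^s_{\mathbf{p}}(v^s(p))^*e^{-ipx} - i\gamma^2 a^{s\dagger}_{\mathbf{p}}(u^s(p))^*e^{ipx}\right) \\
&= -i\gamma^2\psi^*(x) = -i\gamma^2(\psi^\dagger)^T = -i(\bar\psi\gamma^0\gamma^2)^T.
\end{aligned} \tag{3.145}$$

Note that C is a linear unitary operator, even though it takes $\psi \to \psi^*$.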

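Once again, we would like to know how C acts on fermion bilinears. First
we need

$$C\bar\psi(x)C = C\psi^\dagger C\,\gamma^0 = (-i\gamma^2\psi)^T\gamma^0 = (-i\gamma^0\gamma^2\psi)^T. \tag{3.146}$$

Working out the transformations of bilinears is a bit tricky, and it helps to
write in spinor indices. For the scalar,

$$\begin{aligned}
C\bar\psi\psi C &= (-i\gamma^0\gamma^2\psi)^T(-i\bar\psi\gamma^0\gamma^2)^T = -\gamma^0_{ab}\gamma^2_{bc}\psi_c\,\bar\psi_d\gamma^0_{de}\gamma^2_{ea} \\
&= +\bar\psi_d\gamma^0_{de}\gamma^2_{ea}\gamma^0_{ab}\gamma^2_{bc}\psi_c = -\bar\psi\gamma^2\gamma^0\gamma^0\gamma^2\psi \\
&= +\bar\psi\psi.
\end{aligned} \tag{3.147}$$

(The minus sign in the third step is from fermion anticommutation.) The
pseudo-scalar is no more difficult:

$$C\,i\bar\psi\gamma^5\psi\,C = i(-i\gamma^0\gamma^2\psi)^T\gamma^5(-i\bar\psi\gamma^0\gamma^2)^T = i\bar\psi\gamma^5\psi. \tag{3.148}$$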

We must do each component of the vector and pseudo-vector separately. Noting
that $\gamma^0$ and $\gamma^2$ are symmetric matrices while $\gamma^1$ and $\gamma^3$ are antisymmetric,
we eventually find

$$C\bar\psi\gamma^\mu\psi C = -\bar\psi\gamma^\mu\psi; \tag{3.149}$$

$$C\bar\psi\gamma^\mu\gamma^5\psi C = +\bar\psi\gamma^\mu\gamma^5\psi. \tag{3.150}$$

Although the operator C interchanges $\psi$ and $\bar\psi$, it does not actually change
the order of the creation and annihilation operators. Thus, if $\bar\psi\gamma^0\psi$ is defined
to subtract the infinite constant noted above Eq. (3.113), this constant does
not reappear in the process of conjugation by C.
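The component-by-component work for C can also be delegated to a short numerical check. Repeating the index manipulation of (3.147) with a general matrix $\Gamma$ inserted (an extension assumed here, not spelled out in the text) gives $C\bar\psi\Gamma\psi C = \bar\psi\,\gamma^0\gamma^2\,\Gamma^T\gamma^0\gamma^2\,\psi$. The sketch below evaluates that combination in the chiral basis; the helper names are illustrative.

```python
# Pauli matrices and 2x2 helpers (chiral basis, as in this chapter).
S = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]

def neg(m):
    """Return -m for a matrix given as nested lists."""
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(*ms):
    """Multiply any number of 4x4 matrices, left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(4)) for j in range(4)]
               for i in range(4)]
    return out

def transpose(m):
    """Matrix transpose for nested lists."""
    return [list(col) for col in zip(*m)]

def rel_sign(m, ref):
    """Return +1 if m == ref, -1 if m == -ref, None otherwise."""
    pairs = [(x, y) for rm, rr in zip(m, ref) for x, y in zip(rm, rr)]
    if all(abs(x - y) < 1e-12 for x, y in pairs):
        return 1
    if all(abs(x + y) < 1e-12 for x, y in pairs):
        return -1
    return None

G = [block(Z2, I2, I2, Z2)] + [block(Z2, s, neg(s), Z2) for s in S]
G5 = [[1j * x for x in row] for row in mul(G[0], G[1], G[2], G[3])]
ONE = block(I2, Z2, Z2, I2)
C0 = mul(G[0], G[2])  # gamma^0 gamma^2

def charge_conjugation_signs():
    """Sign eta in gamma^0 gamma^2 Gamma^T gamma^0 gamma^2 = eta * Gamma."""
    c = lambda gam: rel_sign(mul(C0, transpose(gam), C0), gam)
    return {
        "scalar": c(ONE),
        "pseudoscalar": c(G5),
        "vector": [c(G[mu]) for mu in range(4)],
        "pseudovector": [c(mul(G[mu], G5)) for mu in range(4)],
    }

print(charge_conjugation_signs())
```

Unlike P and T, the sign here does not depend on $\mu$, which is why (3.149) and (3.150) need no case distinction.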

Summary of C, P, and T 

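The transformation properties of the various fermion bilinears under C, P, and
T are summarized in the table below. Here we use the shorthand $(-1)^\mu \equiv 1$
for $\mu = 0$ and $(-1)^\mu \equiv -1$ for $\mu = 1,2,3$.

$$\begin{array}{c|cccccc}
 & \bar\psi\psi & i\bar\psi\gamma^5\psi & \bar\psi\gamma^\mu\psi & \bar\psi\gamma^\mu\gamma^5\psi & \bar\psi\sigma^{\mu\nu}\psi & \partial_\mu \\ \hline
P & +1 & -1 & (-1)^\mu & -(-1)^\mu & (-1)^\mu(-1)^\nu & (-1)^\mu \\
T & +1 & -1 & (-1)^\mu & (-1)^\mu & -(-1)^\mu(-1)^\nu & -(-1)^\mu \\
C & +1 & +1 & -1 & +1 & -1 & +1 \\
CPT & +1 & +1 & -1 & -1 & +1 & -1
\end{array}$$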


We have included the transformation properties of the tensor bilinear (see 
Problem 3.7), and also of the derivative operator. 
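The CPT row of the table is simply the product of the three rows above it, and the $\mu$-dependent factors must cancel in that product. The following sketch encodes the table (the dictionary layout and function names are illustrative) and verifies both statements.

```python
from itertools import product

def eps(mu):
    """Shorthand (-1)^mu from the table: +1 for mu = 0, -1 for mu = 1, 2, 3."""
    return 1 if mu == 0 else -1

# Each entry maps (mu, nu) to the sign picked up under P, T, and C, in that order.
TABLE = {
    "scalar": (lambda m, n: 1, lambda m, n: 1, lambda m, n: 1),
    "pseudoscalar": (lambda m, n: -1, lambda m, n: -1, lambda m, n: 1),
    "vector": (lambda m, n: eps(m), lambda m, n: eps(m), lambda m, n: -1),
    "pseudovector": (lambda m, n: -eps(m), lambda m, n: eps(m), lambda m, n: 1),
    "tensor": (lambda m, n: eps(m) * eps(n), lambda m, n: -eps(m) * eps(n),
               lambda m, n: -1),
    "derivative": (lambda m, n: eps(m), lambda m, n: -eps(m), lambda m, n: 1),
}

def cpt_row():
    """Multiply the P, T, and C signs; each product must not depend on mu, nu."""
    row = {}
    for name, (p, t, c) in TABLE.items():
        values = {p(m, n) * t(m, n) * c(m, n)
                  for m, n in product(range(4), repeat=2)}
        if len(values) != 1:
            raise ValueError(f"CPT sign of {name} depends on the indices")
        row[name] = values.pop()
    return row

print(cpt_row())
```

Each bilinear with an odd number of Lorentz indices comes out with CPT sign $-1$ and each with an even number comes out $+1$; a Lorentz scalar built by contracting indices in pairs is therefore CPT even.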

Notice first that the free Dirac Lagrangian $\mathcal{L}_0 = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi$ is
invariant under C, P, and T separately. We can build more general quantum
systems that violate any of these symmetries by adding to $\mathcal{L}_0$ some
perturbation $\delta\mathcal{L}$. But $\delta\mathcal{L}$ must be a Lorentz scalar, and the last line of the table
shows that all Lorentz scalar combinations of $\bar\psi$ and $\psi$ are invariant under the
combined symmetry CPT. Actually, it is quite generally true that one cannot
build a Lorentz-invariant quantum field theory with a Hermitian Hamiltonian
that violates CPT.‡

‡This theorem and the spin-statistics theorem are proved with great care in
Streater and Wightman, op. cit.

Problems

3.1 Lorentz group. Recall from Eq. (3.17) the Lorentz commutation relations,

$$[J^{\mu\nu}, J^{\rho\sigma}] = i(g^{\nu\rho}J^{\mu\sigma} - g^{\mu\rho}J^{\nu\sigma} - g^{\nu\sigma}J^{\mu\rho} + g^{\mu\sigma}J^{\nu\rho}).$$

(a) Define the generators of rotations and boosts as

$$L^i = \tfrac{1}{2}\epsilon^{ijk}J^{jk}, \qquad K^i = J^{0i},$$

where i, j, k = 1,2,3. An infinitesimal Lorentz transformation can then be
written

$$\Phi \to (1 - i\boldsymbol{\theta}\cdot\mathbf{L} - i\boldsymbol{\beta}\cdot\mathbf{K})\Phi.$$

Write the commutation relations of these vector operators explicitly. (For
example, $[L^i, L^j] = i\epsilon^{ijk}L^k$.) Show that the combinations

$$\mathbf{J}_+ = \tfrac{1}{2}(\mathbf{L} + i\mathbf{K}) \quad\text{and}\quad \mathbf{J}_- = \tfrac{1}{2}(\mathbf{L} - i\mathbf{K})$$

commute with one another and separately satisfy the commutation relations of
angular momentum.

(b) The finite-dimensional representations of the rotation group correspond precisely
to the allowed values for angular momentum: integers or half-integers. The result
of part (a) implies that all finite-dimensional representations of the Lorentz group
correspond to pairs of integers or half-integers, $(j_+, j_-)$, corresponding to pairs of
representations of the rotation group. Using the fact that $\mathbf{J} = \boldsymbol{\sigma}/2$ in the
spin-1/2 representation of angular momentum, write explicitly the transformation
laws of the 2-component objects transforming according to the $(\tfrac{1}{2}, 0)$ and $(0, \tfrac{1}{2})$
representations of the Lorentz group. Show that these correspond precisely to
the transformations of $\psi_L$ and $\psi_R$ given in (3.37).

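(c) The identity $\boldsymbol{\sigma}^T = -\sigma^2\boldsymbol{\sigma}\sigma^2$ allows us to rewrite the $\psi_L$ transformation in the
unitarily equivalent form

$$\psi' \to \psi'\left(1 + i\boldsymbol{\theta}\cdot\frac{\boldsymbol{\sigma}}{2} + \boldsymbol{\beta}\cdot\frac{\boldsymbol{\sigma}}{2}\right),$$

where $\psi' = \psi_L^T\sigma^2$. Using this law, we can represent the object that transforms
as $(\tfrac{1}{2}, \tfrac{1}{2})$ as a 2 x 2 matrix that has the $\psi_R$ transformation law on the left and,
simultaneously, the transposed $\psi_L$ transformation on the right. Parametrize this
matrix as

$$\begin{pmatrix} V^0 + V^3 & V^1 - iV^2 \\ V^1 + iV^2 & V^0 - V^3 \end{pmatrix}.$$

Show that the object $V^\mu$ transforms as a 4-vector.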

3.2 Derive the Gordon identity,

$$\bar u(p')\gamma^\mu u(p) = \bar u(p')\left[\frac{p'^\mu + p^\mu}{2m} + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\right]u(p),$$

where $q = (p' - p)$. We will put this formula to use in Chapter 6.

3.3 Spinor products. (This problem, together with Problems 5.3 and 5.6,
introduces an efficient computational method for processes involving massless particles.)
Let $k_0^\mu$, $k_1^\mu$ be fixed 4-vectors satisfying $k_0^2 = 0$, $k_1^2 = -1$, $k_0\cdot k_1 = 0$. Define basic
spinors in the following way: Let $u_{L0}$ be the left-handed spinor for a fermion with
momentum $k_0$. Let $u_{R0} = \not\!k_1 u_{L0}$. Then, for any p such that p is lightlike ($p^2 = 0$),
define

$$u_L(p) = \frac{1}{\sqrt{2p\cdot k_0}}\,\not\!p\,u_{R0} \quad\text{and}\quad u_R(p) = \frac{1}{\sqrt{2p\cdot k_0}}\,\not\!p\,u_{L0}.$$

This set of conventions defines the phases of spinors unambiguously (except when p is
parallel to $k_0$).

(a) Show that $\not\!k_0 u_{R0} = 0$. Show that, for any lightlike p, $\not\!p\,u_L(p) = \not\!p\,u_R(p) = 0$.

(b) For the choices $k_0 = (E, 0, 0, -E)$, $k_1 = (0, 1, 0, 0)$, construct $u_{L0}$, $u_{R0}$, $u_L(p)$,
and $u_R(p)$ explicitly.

(c) Define the spinor products $s(p_1,p_2)$ and $t(p_1,p_2)$, for $p_1$, $p_2$ lightlike, by

$$s(p_1,p_2) = \bar u_R(p_1)u_L(p_2), \qquad t(p_1,p_2) = \bar u_L(p_1)u_R(p_2).$$

Using the explicit forms for the $u_\lambda$ given in part (b), compute the spinor products
explicitly and show that $t(p_1,p_2) = (s(p_2,p_1))^*$ and $s(p_1,p_2) = -s(p_2,p_1)$. In
addition, show that

$$|s(p_1,p_2)|^2 = 2p_1\cdot p_2.$$

Thus the spinor products are the square roots of 4-vector dot products.

3.4 Majorana fermions. Recall from Eq. (3.40) that one can write a relativistic
equation for a massless 2-component fermion field that transforms as the upper two
components of a Dirac spinor ($\psi_L$). Call such a 2-component field $\chi_a(x)$, a = 1,2.

(a) Show that it is possible to write an equation for $\chi(x)$ as a massive field in the
following way:

$$i\bar\sigma\cdot\partial\chi - im\sigma^2\chi^* = 0.$$

That is, show, first, that this equation is relativistically invariant and, second,
that it implies the Klein-Gordon equation, $(\partial^2 + m^2)\chi = 0$. This form of the
fermion mass is called a Majorana mass term.

(b) Does the Majorana equation follow from a Lagrangian? The mass term would
seem to be the variation of $(\sigma^2)_{ab}\chi^*_a\chi^*_b$; however, since $\sigma^2$ is antisymmetric, this
expression would vanish if $\chi(x)$ were an ordinary c-number field. When we go to
quantum field theory, we know that $\chi(x)$ will become an anticommuting quantum
field. Therefore, it makes sense to develop its classical theory by considering
$\chi(x)$ as a classical anticommuting field, that is, as a field that takes as values
Grassmann numbers which satisfy

$$\alpha\beta = -\beta\alpha \quad\text{for any } \alpha, \beta.$$

Note that this relation implies that $\alpha^2 = 0$. A Grassmann field $\xi(x)$ can be
expanded in a basis of functions as

$$\xi(x) = \sum_n \alpha_n\phi_n(x),$$

where the $\phi_n(x)$ are orthogonal c-number functions and the $\alpha_n$ are a set of
independent Grassmann numbers. Define the complex conjugate of a product of
Grassmann numbers to reverse the order:

$$(\alpha\beta)^* \equiv \beta^*\alpha^* = -\alpha^*\beta^*.$$

This rule imitates the Hermitian conjugation of quantum fields. Show that the
classical action,

$$S = \int d^4x\left[\chi^\dagger i\bar\sigma\cdot\partial\chi + \frac{im}{2}\left(\chi^T\sigma^2\chi - \chi^\dagger\sigma^2\chi^*\right)\right],$$

(where $\chi^\dagger = (\chi^*)^T$) is real ($S^* = S$), and that varying this S with respect to $\chi$
and $\chi^*$ yields the Majorana equation.
(c) Let us write a 4-component Dirac field as

$$\psi(x) = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix},$$

and recall that the lower components of $\psi$ transform in a way equivalent by a
unitary transformation to the complex conjugate of the representation $\psi_L$. In
this way, we can rewrite the 4-component Dirac field in terms of two 2-component
spinors:

$$\psi_L(x) = \chi_1(x), \qquad \psi_R(x) = i\sigma^2\chi_2^*(x).$$

Rewrite the Dirac Lagrangian in terms of $\chi_1$ and $\chi_2$ and note the form of the
mass term.

(d) Show that the action of part (c) has a global symmetry. Compute the divergences
of the currents

$$J^\mu = \chi^\dagger\bar\sigma^\mu\chi, \qquad J^\mu = \chi_1^\dagger\bar\sigma^\mu\chi_1 - \chi_2^\dagger\bar\sigma^\mu\chi_2,$$

for the theories of parts (b) and (c), respectively, and relate your results to the
symmetries of these theories. Construct a theory of N free massive 2-component
fermion fields with O(N) symmetry (that is, the symmetry of rotations in an
N-dimensional space).

(e) Quantize the Majorana theory of parts (a) and (b). That is, promote $\chi(x)$ to a
quantum field satisfying the canonical anticommutation relation

$$\{\chi_a(\mathbf{x}), \chi_b^\dagger(\mathbf{y})\} = \delta_{ab}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}),$$

construct a Hermitian Hamiltonian, and find a representation of the canonical
commutation relations that diagonalizes the Hamiltonian in terms of a set of
creation and annihilation operators. (Hint: Compare $\chi(x)$ to the top two
components of the quantized Dirac field.)

3.5 Supersymmetry. It is possible to write field theories with continuous
symmetries linking fermions and bosons; such transformations are called supersymmetries.

(a) The simplest example of a supersymmetric field theory is the theory of a free
complex boson and a free Weyl fermion, written in the form

$$\mathcal{L} = \partial_\mu\phi^*\partial^\mu\phi + \chi^\dagger i\bar\sigma\cdot\partial\chi + F^*F.$$

Here F is an auxiliary complex scalar field whose field equation is F = 0. Show
that this Lagrangian is invariant (up to a total divergence) under the
infinitesimal transformation

$$\begin{aligned}
\delta\phi &= -i\epsilon^T\sigma^2\chi, \\
\delta\chi &= \epsilon F + \sigma\cdot\partial\phi\,\sigma^2\epsilon^*, \\
\delta F &= -i\epsilon^\dagger\bar\sigma\cdot\partial\chi,
\end{aligned}$$

where the parameter $\epsilon_a$ is a 2-component spinor of Grassmann numbers.

(b) Show that the term

$$\Delta\mathcal{L} = \left[m\phi F + \tfrac{1}{2}im\chi^T\sigma^2\chi\right] + (\text{complex conjugate})$$

is also left invariant by the transformation given in part (a). Eliminate F from
the complete Lagrangian $\mathcal{L} + \Delta\mathcal{L}$ by solving its field equation, and show that
the fermion and boson fields $\phi$ and $\chi$ are given the same mass.

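(c) It is possible to write supersymmetric nonlinear field equations by adding cubic
and higher-order terms to the Lagrangian. Show that the following rather general
field theory, containing the field $(\phi_i, \chi_i)$, i = 1,...,n, is supersymmetric:

$$\mathcal{L} = \partial_\mu\phi_i^*\partial^\mu\phi_i + \chi_i^\dagger i\bar\sigma\cdot\partial\chi_i + F_i^*F_i + \left(F_i\frac{\partial W[\phi]}{\partial\phi_i} + \frac{i}{2}\frac{\partial^2W[\phi]}{\partial\phi_i\partial\phi_j}\chi_i^T\sigma^2\chi_j + \text{c.c.}\right),$$

where $W[\phi]$ is an arbitrary function of the $\phi_i$, called the superpotential. For the
simple case n = 1 and $W = g\phi^3/3$, write out the field equations for $\phi$ and $\chi$
(after elimination of F).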


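3.6 Fierz transformations. Let $u_i$, i = 1,...,4, be four 4-component Dirac
spinors. In the text, we proved the Fierz rearrangement formulae (3.78) and (3.79).
The first of these formulae can be written in 4-component notation as

$$\bar u_1\gamma^\mu\left(\frac{1+\gamma^5}{2}\right)u_2\;\bar u_3\gamma_\mu\left(\frac{1+\gamma^5}{2}\right)u_4 = -\bar u_1\gamma^\mu\left(\frac{1+\gamma^5}{2}\right)u_4\;\bar u_3\gamma_\mu\left(\frac{1+\gamma^5}{2}\right)u_2.$$

In fact, there are similar rearrangement formulae for any product

$$(\bar u_1\Gamma^A u_2)(\bar u_3\Gamma^B u_4),$$

where $\Gamma^A$, $\Gamma^B$ are any of the 16 combinations of Dirac matrices listed in Section 3.4.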

(a) To begin, normalize the 16 matrices $\Gamma^A$ to the convention

$$\mathrm{tr}[\Gamma^A\Gamma^B] = 4\delta^{AB}.$$

This gives $\Gamma^A = \{1, \gamma^0, i\gamma^j, \ldots\}$; write all 16 elements of this set.

(b) Write the general Fierz identity as an equation

$$(\bar u_1\Gamma^A u_2)(\bar u_3\Gamma^B u_4) = \sum_{C,D} C^{AB}{}_{CD}\,(\bar u_1\Gamma^C u_4)(\bar u_3\Gamma^D u_2),$$

with unknown coefficients $C^{AB}{}_{CD}$. Using the completeness of the 16 $\Gamma^A$ matrices,
show that

$$C^{AB}{}_{CD} = \frac{1}{16}\mathrm{tr}[\Gamma^C\Gamma^A\Gamma^D\Gamma^B].$$

(c) Work out explicitly the Fierz transformation laws for the products $(\bar u_1u_2)(\bar u_3u_4)$
and $(\bar u_1\gamma^\mu u_2)(\bar u_3\gamma_\mu u_4)$.

3.7 This problem concerns the discrete symmetries P, C, and T. 

(a) Compute the transformation properties under P, C, and T of the antisymmetric
tensor fermion bilinears, $\bar\psi\sigma^{\mu\nu}\psi$, with $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu,\gamma^\nu]$. This completes the table
of the transformation properties of bilinears at the end of the chapter.

(b) Let $\phi(x)$ be a complex-valued Klein-Gordon field, such as we considered in
Problem 2.2. Find unitary operators P, C and an antiunitary operator T (all defined
in terms of their action on the annihilation operators $a_{\mathbf{p}}$ and $b_{\mathbf{p}}$ for the
Klein-Gordon particles and antiparticles) that give the following transformations of the
Klein-Gordon field:

$$\begin{aligned}
P\,\phi(t,\mathbf{x})\,P &= \phi(t,-\mathbf{x}); \\
T\,\phi(t,\mathbf{x})\,T &= \phi(-t,\mathbf{x}); \\
C\,\phi(t,\mathbf{x})\,C &= \phi^*(t,\mathbf{x}).
\end{aligned}$$

Find the transformation properties of the components of the current

$$J^\mu = i(\phi^*\partial^\mu\phi - \partial^\mu\phi^*\phi)$$

under P, C, and T.

(c) Show that any Hermitian Lorentz-scalar local operator built from $\psi(x)$, $\phi(x)$,
and their conjugates has CPT = +1.

3.8 Bound states. Two spin-1/2 particles can combine to a state of total spin either 
0 or 1. The wavefunctions for these states are odd and even, respectively, under the 
interchange of the two spins. 

(a) Use this information to compute the quantum numbers under P and C of all 
electron-positron bound states with S, P, or D wavefunctions. 

(b) Since the electron-photon coupling is given by the Hamiltonian

$$\Delta H = \int d^3x\; eA_\mu j^\mu,$$

where $j^\mu$ is the electric current, electrodynamics is invariant to P and C if
the components of the vector potential have the same P and C parity as the
corresponding components of $j^\mu$. Show that this implies the following surprising
fact: The spin-0 ground state of positronium can decay to 2 photons, but the
spin-1 ground state must decay to 3 photons. Find the selection rules for the
annihilation of higher positronium states, and for 1-photon transitions between
positronium levels.
Interacting Fields and Feynman Diagrams


4.1 Perturbation Theory—Philosophy and Examples 

We have now discussed in some detail the quantization of two free field theories 
that give approximate descriptions of many of the particles found in Nature. 
Up to this point, however, free-particle states have been eigenstates of the 
Hamiltonian; we have seen no interactions and no scattering. In order to obtain 
a closer description of the real world, we must include new, nonlinear terms 
in the Hamiltonian (or Lagrangian) that will couple different Fourier modes 
(and the particles that occupy them) to one another. To preserve causality, 
we insist that the new terms may involve only products of fields at the same 
spacetime point: $[\phi(x)]^4$ is fine, but $\phi(x)\phi(y)$ is not allowed. Thus the terms
describing the interactions will be of the form

$$H_{\text{int}} = \int d^3x\;\mathcal{H}_{\text{int}}[\phi(x)] = -\int d^3x\;\mathcal{L}_{\text{int}}[\phi(x)].$$

For now we restrict ourselves to theories in which $\mathcal{H}_{\text{int}}$ $(= -\mathcal{L}_{\text{int}})$ is a function
only of the fields, not of their derivatives.

In this chapter we will discuss three important examples of interacting 
field theories. The first is “phi-fourth” theory, 

$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4, \tag{4.1}$$

where $\lambda$ is a dimensionless coupling constant. (A $\phi^3$ interaction would be a bit
simpler, but then the energy would not be positive-definite unless we added
a higher even power of $\phi$ as well.) Although we are introducing this theory
now for purely pedagogical reasons (since it is the simplest of all interacting
quantum theories), models of the real world do contain $\phi^4$ interactions; the
most important example in particle physics is the self-interaction of the Higgs
field in the standard electroweak theory. In Part II, we will see that $\phi^4$ theory
also arises in statistical mechanics. The equation of motion for $\phi^4$ theory is

$$(\partial^2 + m^2)\phi = -\frac{\lambda}{3!}\phi^3, \tag{4.2}$$
which cannot be solved by Fourier analysis as the free Klein-Gordon equation
could. In the quantum theory we impose the equal-time commutation relations

$$[\phi(\mathbf{x}), \pi(\mathbf{y})] = i\delta^{(3)}(\mathbf{x} - \mathbf{y}),$$

which are unaffected by $\mathcal{L}_{\text{int}}$. (Note, however, that if $\mathcal{L}_{\text{int}}$ contained $\partial_\mu\phi$, the
definition of $\pi(\mathbf{x})$ would change.) It is an easy exercise to write down the
Hamiltonian of this theory and find the Heisenberg equation of motion for
the operator $\phi(x)$; the result is the same as the classical equation of motion
(4.2), just as it was in the free theory.
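One way the exercise can go is sketched below, assuming the standard canonical Hamiltonian obtained from (4.1) with $\pi = \dot\phi$ and using only the equal-time commutator quoted above; the intermediate steps are not taken from the text.

```latex
\begin{aligned}
H &= \int d^3x \left[ \tfrac{1}{2}\pi^2 + \tfrac{1}{2}(\nabla\phi)^2
      + \tfrac{1}{2}m^2\phi^2 + \frac{\lambda}{4!}\phi^4 \right], \\[4pt]
i\,\frac{\partial\phi}{\partial t} &= [\phi, H] = i\pi, \\[4pt]
i\,\frac{\partial\pi}{\partial t} &= [\pi, H]
      = -i\left( -\nabla^2\phi + m^2\phi + \frac{\lambda}{3!}\phi^3 \right), \\[4pt]
\Rightarrow\quad
\frac{\partial^2\phi}{\partial t^2} &= \nabla^2\phi - m^2\phi - \frac{\lambda}{3!}\phi^3
\quad\Longleftrightarrow\quad
(\partial^2 + m^2)\phi = -\frac{\lambda}{3!}\phi^3 .
\end{aligned}
```

The only new ingredient relative to the free theory is the commutator of $\pi$ with the quartic term, which produces the cubic source on the right-hand side.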

Our second example of an interacting field theory will be Quantum
Electrodynamics:

$$\begin{aligned}
\mathcal{L}_{\text{QED}} &= \mathcal{L}_{\text{Dirac}} + \mathcal{L}_{\text{Maxwell}} + \mathcal{L}_{\text{int}} \\
&= \bar\psi(i\not\!\partial - m)\psi - \tfrac{1}{4}(F_{\mu\nu})^2 - e\bar\psi\gamma^\mu\psi A_\mu,
\end{aligned} \tag{4.3}$$

where $A_\mu$ is the electromagnetic vector potential, $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the
electromagnetic field tensor, and $e = -|e|$ is the electron charge. (To describe
a fermion of charge Q, replace e with Q. If we wish to consider several species
of charged particles at once, we simply duplicate $\mathcal{L}_{\text{Dirac}}$ and $\mathcal{L}_{\text{int}}$ for each
additional species.) That such a simple Lagrangian can account for nearly
all observed phenomena from macroscopic scales down to $10^{-13}$ cm is rather
astonishing. In fact, the QED Lagrangian can be written even more simply:

$$\mathcal{L}_{\text{QED}} = \bar\psi(i\not\!\!D - m)\psi - \tfrac{1}{4}(F_{\mu\nu})^2, \tag{4.4}$$

where $D_\mu$ is the gauge covariant derivative,

$$D_\mu \equiv \partial_\mu + ieA_\mu(x). \tag{4.5}$$

A crucial property of the QED Lagrangian is that it is invariant under the
gauge transformation

$$\psi(x) \to e^{i\alpha(x)}\psi(x), \qquad A_\mu \to A_\mu - \frac{1}{e}\partial_\mu\alpha(x), \tag{4.6}$$

which is realized on the Dirac field as a local phase rotation. This invariance
under local phase rotations has a fundamental geometrical significance, which
motivates the term covariant derivative. For our present purposes, though, it
is sufficient just to recognize (4.6) as a symmetry of the theory.
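That (4.6) is a symmetry can be seen in one line from the definition (4.5). The following worked step is a sketch added for illustration; it uses only (4.5) and (4.6).

```latex
\begin{aligned}
D_\mu\psi \;\to\;&
\left[ \partial_\mu + ie\left( A_\mu - \frac{1}{e}\partial_\mu\alpha \right) \right]
e^{i\alpha(x)}\psi \\
=\;& e^{i\alpha(x)}\left[ \partial_\mu\psi + i(\partial_\mu\alpha)\psi
     + ieA_\mu\psi - i(\partial_\mu\alpha)\psi \right] \\
=\;& e^{i\alpha(x)}\,D_\mu\psi .
\end{aligned}
```

Since $D_\mu\psi$ picks up the same phase as $\psi$ and $\bar\psi$ picks up the opposite one, $\bar\psi(i\gamma^\mu D_\mu - m)\psi$ is unchanged; the shift of $A_\mu$ by a gradient cancels in the antisymmetric combination $F_{\mu\nu}$, so the Maxwell term is unchanged as well.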

The equations of motion follow from (4.3) by the canonical procedure.
The Euler-Lagrange equation for $\psi$ is

$$(i\not\!\!D - m)\psi(x) = 0, \tag{4.7}$$

which is just the Dirac equation coupled to the electromagnetic field by the
minimal coupling prescription, $\partial \to D$. The Euler-Lagrange equation for $A_\nu$
is

$$\partial_\mu F^{\mu\nu} = e\bar\psi\gamma^\nu\psi = ej^\nu. \tag{4.8}$$
These are the inhomogeneous Maxwell equations, with the current density
$j^\nu = \bar\psi\gamma^\nu\psi$ given by the conserved Dirac vector current (3.73). As with $\phi^4$
theory, the equations of motion can also be obtained as the Heisenberg
equations of motion for the operators $\psi(x)$ and $A_\mu(x)$. This is easy to verify for
$\psi(x)$; we have not yet discussed the quantization of the electromagnetic field.

In fact, we will not discuss canonical quantization of the electromagnetic
field at all in this book. It is an awkward subject, essentially because of gauge
invariance. Note that since $\dot A^0$ does not appear in the Lagrangian (4.3), the
momentum conjugate to $A^0$ is identically zero. This contradicts the canonical
commutation relation $[A^0(\mathbf{x}), \pi^0(\mathbf{y})] = i\delta(\mathbf{x} - \mathbf{y})$. One solution is to quantize
in Coulomb gauge, where $\nabla\cdot\mathbf{A} = 0$ and $A^0$ is a constrained, rather than
dynamical, variable; but then manifest Lorentz invariance is sacrificed.
Alternatively, one can quantize the field in Lorentz gauge, $\partial_\mu A^\mu = 0$. It is then
possible to modify the Lagrangian, adding an $\dot A^0$ term. One obtains the
commutation relations $[A^\mu(\mathbf{x}), \dot A^\nu(\mathbf{y})] = -ig^{\mu\nu}\delta(\mathbf{x} - \mathbf{y})$, essentially the same as
four Klein-Gordon fields. But the extra minus sign in $[A^0, \dot A^0]$ leads to another
(surmountable) difficulty: states created by $a^{0\dagger}_{\mathbf{p}}$ have negative norm.*

The Feynman rules for calculating scattering amplitudes that involve photons
are derived more easily in the functional integral formulation of field theory,
to be discussed in Chapter 9. That method has the added advantage of
generalizing readily to the case of non-Abelian gauge fields, as we will see
in Part III. In the present chapter we will simply guess the Feynman rules
for photons. This will actually be quite easy after we derive the rules for an
analogous but simpler theory, Yukawa theory:

$$\mathcal{L}_{\text{Yukawa}} = \mathcal{L}_{\text{Dirac}} + \mathcal{L}_{\text{Klein-Gordon}} - g\bar\psi\psi\phi. \tag{4.9}$$

This will be our third example. It is similar to QED, but with the photon
replaced by a scalar particle $\phi$. The interaction term contains a dimensionless
coupling constant g, analogous to the electron charge e. Yukawa originally
invented this theory to describe nucleons ($\psi$) and pions ($\phi$). In modern particle
theory, the Standard Model contains Yukawa interaction terms coupling the
scalar Higgs field to quarks and leptons; most of the free parameters in the
Standard Model are Yukawa coupling constants.

Having written down our three paradigm interactions, let us pause a moment
to discuss what other interactions could be found in Nature. At first it
might seem that the list would be infinite; even for a scalar theory we could
write down interactions of the form $\phi^n$ for any n. But remarkably, one simple
and reasonable axiom eliminates all but a few of the possible interactions. That
axiom is that the theory be renormalizable, and it arises as follows. Higher-order
terms in perturbation theory, as mentioned in Chapter 1, will involve
integrals over the 4-momenta of intermediate (“virtual”) particles. These
integrals are often formally divergent, and it is generally necessary to impose
some form of cut-off procedure; the simplest is just to cut off the integral at
some large but finite momentum $\Lambda$. At the end of the calculation one takes
the limit $\Lambda \to \infty$, and hopes that physical quantities turn out to be independent
of $\Lambda$. If this is indeed the case, the theory is said to be renormalizable.
Suppose, however, that the theory includes interactions whose coupling constants
have the dimensions of mass to some negative power. Then to obtain
a dimensionless scattering amplitude, this coupling constant must be multiplied
by some quantity of positive mass dimension, and it turns out that this
quantity is none other than $\Lambda$. Such a term diverges as $\Lambda \to \infty$, so the theory
is not renormalizable.

*Excellent treatments of both quantization procedures are readily available. For
Coulomb gauge quantization, see Bjorken and Drell (1965), Chapter 14; for Lorentz
gauge quantization, see Mandl and Shaw (1984), Chapter 5.

We will discuss these matters in detail in Chapter 10. For now we merely
note that any theory containing a coupling constant with negative mass
dimension is not renormalizable. A bit of dimensional analysis then allows us to
throw out nearly all candidate interactions. Since the action $S = \int\mathcal{L}\,d^4x$ is
dimensionless, $\mathcal{L}$ must have dimension $(\text{mass})^4$ (or simply dimension 4). From
the kinetic terms of the various free Lagrangians, we note that the scalar and
vector fields $\phi$ and $A^\mu$ have dimension 1, while the spinor field $\psi$ has dimension
3/2. We can now tabulate all of the allowed renormalizable interactions.

For theories involving only scalars, the allowed interaction terms are

$$\mu\phi^3 \quad\text{and}\quad \lambda\phi^4.$$

The coupling constant $\mu$ has dimension 1, while $\lambda$ is dimensionless. Terms of
the form $\phi^n$ for n > 4 are not allowed, since their coupling constants would
have dimension 4 - n. Of course, more interesting theories can be obtained by
including several scalar fields, real or complex (see Problem 4.3).

Next we can add spinor fields. Spinor self-interactions are not allowed,
since $\psi^3$ (besides violating Lorentz invariance) already has dimension 9/2.
Thus the only allowable new interaction is the Yukawa term,

$$g\bar\psi\psi\phi,$$

although similar interactions can also be constructed out of Weyl and
Majorana spinors.

When we add vector fields, many new interactions are possible. The most
familiar is the vector-spinor interaction of QED,

$$e\bar\psi\gamma^\mu\psi A_\mu.$$

Again it is easy to construct similar terms out of Weyl and Majorana spinors.
Less important is the scalar QED Lagrangian,

$$\mathcal{L} = |D_\mu\phi|^2 - m^2|\phi|^2, \quad\text{which contains}\quad eA^\mu\phi\,\partial_\mu\phi^*, \quad e^2|\phi|^2A^2.$$

This is our first example of a derivative interaction; quantization of this theory
will be much easier with the functional integral formalism, so we postpone its
discussion until Chapter 9. Other possible Lorentz-invariant terms involving
vectors are

$$A^2(\partial_\mu A^\mu) \quad\text{and}\quad A^4.$$

Although it is far from obvious, these terms lead to inconsistencies unless
their coupling constants are precisely chosen on the basis of a special type of
symmetry, which must involve several vector fields. This symmetry underlies
the non-Abelian gauge theories, which will be the main subject of Part III. A
mass term $\tfrac{1}{2}m^2A^2$ for vector fields is also inconsistent, except in the special
case where it is added to QED; in any case, it breaks (Abelian or non-Abelian)
gauge invariance.
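The dimensional analysis used to build this list is mechanical enough to automate. The sketch below assigns the mass dimensions quoted above (1 for $\phi$ and $A^\mu$, 3/2 for $\psi$, and 1 for each derivative) and computes the dimension of the coupling in front of a candidate term; the dictionary keys and function names are illustrative, not notation from the text.

```python
from fractions import Fraction

# Mass dimensions read off from the kinetic terms; a derivative counts as 1.
DIM = {"phi": Fraction(1), "A": Fraction(1), "psi": Fraction(3, 2), "d": Fraction(1)}

def coupling_dimension(term):
    """Mass dimension of the coupling in front of a product of fields.

    `term` maps a field name to the power with which it appears, for example
    {"psi": 2, "phi": 1} for the Yukawa term.
    """
    return 4 - sum(DIM[name] * power for name, power in term.items())

def is_renormalizable(term):
    """Criterion used in the text: no coupling of negative mass dimension."""
    return coupling_dimension(term) >= 0

CANDIDATES = {
    "phi^3": {"phi": 3},
    "phi^4": {"phi": 4},
    "phi^5": {"phi": 5},
    "psi-bar psi phi": {"psi": 2, "phi": 1},
    "psi-bar gamma psi A": {"psi": 2, "A": 1},
    "(psi-bar psi)^2": {"psi": 4},
    "A^2 (d A)": {"A": 3, "d": 1},
    "A^4": {"A": 4},
}

for name, term in CANDIDATES.items():
    print(name, coupling_dimension(term), is_renormalizable(term))
```

Power counting is only a necessary condition: as noted above, the pure vector terms pass this test yet are consistent only when their couplings are fixed by a non-Abelian gauge symmetry.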

This exhausts the list of possible Lagrangians involving scalar, spinor, and 
vector particles. It is interesting to note that the currently accepted models 
of the strong, weak, and electromagnetic interactions include all of the types 
of interactions listed above. The three paradigm interactions to be studied in 
this chapter cover nearly half of the possibilities; we will study the others in 
detail later in this book. 

The assumption that realistic theories must be renormalizable is certainly
convenient, since a nonrenormalizable theory would have little predictive
power. However, one might still ask why Nature has been so kind as to
use only renormalizable interactions. One might have expected that the true
theory of Nature would be a quantum theory of a much more general type.
But it can be shown that, however complicated a fundamental theory appears
at very high energies, the low-energy approximation to this theory that we
see in experiments should be a renormalizable quantum field theory. We will
demonstrate this in Section 12.1.

At a more practical level, the preceding analysis highlights a great
difference in methodology between nonrelativistic quantum mechanics and
relativistic quantum field theory. Since the potential $U(r)$ that appears in the
Schrödinger equation is completely arbitrary, nonrelativistic quantum
mechanics puts no limits on what interactions can be found in the real world. But
we have just seen that quantum field theory imposes very tight constraints
on Nature (or vice versa). Taken literally, our discussion implies that the only
tasks left for particle physicists are to enumerate the elementary particles that
exist and to measure their masses and coupling constants. While this
viewpoint is perhaps overly arrogant, the fact that it is even thinkable is surely
a sign that particle physicists are on the right track toward a fundamental
theory.

Given a set of particles and couplings, we must still work out the
experimental consequences. How do we analyze the quantum mechanics of an
interacting field theory? It would be nice if we could explicitly solve at least
a few examples (that is, find the exact eigenvalues and eigenvectors as we did
for the free theories) to get a feel for the properties of interacting theories.
Unfortunately, this is easier said than done. No exactly solvable interacting
field theories are known in more than two spacetime dimensions, and even
there the solvable models involve special symmetries and considerable
technical complication.* Studying these theories would be interesting, but hardly
worth the effort at this stage. Instead we will fall back on a much simpler and
more generally applicable approach: Treat the interaction term $H_{\text{int}}$ as a
perturbation, compute its effects as far in perturbation theory as is practicable,
and hope that the coupling constant is small enough that this gives a
reasonable approximation to the exact answer. In fact, the perturbation series we
obtain will turn out to be very simple in structure; through the use of
Feynman diagrams it will be possible at least to visualize the effects of interactions
to arbitrarily high order.

This simplification of the perturbation series for relativistic field theories
was the great advance of Tomonaga, Schwinger, and Feynman. To achieve
this simplification, each, independently, found a way to reformulate
quantum mechanics to remove the special role of time, and then applied his new
viewpoint to recast each term of the perturbation expansion as a spacetime
process. We will develop quantum field theory from a spacetime viewpoint,
using Feynman’s method of functional integration, in Chapter 9. In the present
chapter we follow a more pedestrian line of analysis, first developed by Dyson,
to derive the spacetime picture of perturbation theory from the conventional
machinery of quantum mechanics.†

4.2 Perturbation Expansion of Correlation Functions 

Let us then begin the study of perturbation theory for interacting fields,
aiming toward a formalism that will allow us to visualize the perturbation series
as spacetime processes. Although we will not need to reformulate quantum
mechanics, we will rederive time-dependent perturbation theory in a form
that is convenient for our purposes. Ultimately, of course, we want to
calculate scattering cross sections and decay rates. For now, however, let us be less
ambitious and try to calculate a simpler (but more abstract) quantity, the
two-point correlation function, or two-point Green’s function,

$$\langle\Omega|\,T\phi(x)\phi(y)\,|\Omega\rangle, \tag{4.10}$$

in $\phi^4$ theory. We introduce the notation $|\Omega\rangle$ to denote the ground state of the
interacting theory, which is generally different from $|0\rangle$, the ground state of
the free theory. The time-ordering symbol $T$ is inserted for later convenience.
The correlation function can be interpreted physically as the amplitude for
propagation of a particle or excitation between $y$ and $x$. In the free theory, it

*A brief survey of exactly solvable quantum field theories is given in the Epilogue. 

†For a historical account of the contributions of Tomonaga, Schwinger, Feynman,
and Dyson, see Schweber (1994).
is simply the Feynman propagator:

$$\langle 0|\,T\phi(x)\phi(y)\,|0\rangle_{\text{free}} = D_F(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i\,e^{-ip\cdot(x-y)}}{p^2-m^2+i\epsilon}. \tag{4.11}$$

We would like to know how this expression changes in the interacting
theory. Once we have analyzed the two-point correlation function, it will be easy
to generalize our results to higher correlation functions in which more than
two field operators appear. In Sections 4.3 and 4.4 we will continue the
analysis of correlation functions, eventually developing the formalism of Feynman
diagrams for evaluating them perturbatively. Then in Sections 4.5 and 4.6 
we will learn how to calculate cross sections and decay rates using the same 
techniques. 

To attack this problem, we write the Hamiltonian of $\phi^4$ theory as

$$H = H_0 + H_{\text{int}} = H_{\text{Klein-Gordon}} + \int d^3x\,\frac{\lambda}{4!}\,\phi^4(\mathbf{x}). \tag{4.12}$$

We want an expression for the two-point correlation function (4.10) as a power
series in $\lambda$. The interaction Hamiltonian $H_{\text{int}}$ enters (4.10) in two places: first,
in the definition of the Heisenberg field,

$$\phi(x) = e^{iHt}\,\phi(\mathbf{x})\,e^{-iHt}; \tag{4.13}$$

and second, in the definition of $|\Omega\rangle$. We must express both $\phi(x)$ and $|\Omega\rangle$ in
terms of quantities we know how to manipulate: free field operators and the
free theory vacuum $|0\rangle$.

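It is easiest to begin with $\phi(x)$. At any fixed time $t_0$, we can of course
expand $\phi$ as before in terms of ladder operators:

$$\phi(t_0,\mathbf{x}) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_\mathbf{p}}}\left(a_\mathbf{p}\,e^{i\mathbf{p}\cdot\mathbf{x}} + a^\dagger_\mathbf{p}\,e^{-i\mathbf{p}\cdot\mathbf{x}}\right).$$

Then to obtain $\phi(t,\mathbf{x})$ for $t \neq t_0$, we just switch to the Heisenberg picture as
usual:

$$\phi(t,\mathbf{x}) = e^{iH(t-t_0)}\,\phi(t_0,\mathbf{x})\,e^{-iH(t-t_0)}.$$

For $\lambda = 0$, $H$ becomes $H_0$ and this reduces to

$$\phi(t,\mathbf{x})\big|_{\lambda=0} = e^{iH_0(t-t_0)}\,\phi(t_0,\mathbf{x})\,e^{-iH_0(t-t_0)} \equiv \phi_I(t,\mathbf{x}). \tag{4.14}$$

When $\lambda$ is small, this expression will still give the most important part of
the time dependence of $\phi(x)$, and thus it is convenient to give this quantity
a name: the interaction picture field, $\phi_I(t,\mathbf{x})$. Since we can diagonalize $H_0$, it
is easy to construct $\phi_I$ explicitly:

$$\phi_I(t,\mathbf{x}) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_\mathbf{p}}}\left(a_\mathbf{p}\,e^{-ip\cdot x} + a^\dagger_\mathbf{p}\,e^{ip\cdot x}\right)\Big|_{x^0 = t-t_0}. \tag{4.15}$$

This is just the familiar expression for the free field from Chapter 2.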
The problem now is to express the full Heisenberg field $\phi$ in terms of $\phi_I$.
Formally, it is just

$$\begin{aligned}
\phi(t,\mathbf{x}) &= e^{iH(t-t_0)}e^{-iH_0(t-t_0)}\,\phi_I(t,\mathbf{x})\,e^{iH_0(t-t_0)}e^{-iH(t-t_0)}\\
&\equiv U^\dagger(t,t_0)\,\phi_I(t,\mathbf{x})\,U(t,t_0),
\end{aligned}\tag{4.16}$$

where we have defined the unitary operator

$$U(t,t_0) = e^{iH_0(t-t_0)}\,e^{-iH(t-t_0)}, \tag{4.17}$$


known as the interaction picture propagator or time-evolution operator. We
would like to express $U(t,t_0)$ entirely in terms of $\phi_I$, for which we have an
explicit expression in terms of ladder operators. To do this, we note that
$U(t,t_0)$ is the unique solution, with initial condition $U(t_0,t_0) = 1$, of a simple
differential equation (the Schrödinger equation):

$$\begin{aligned}
i\frac{\partial}{\partial t}U(t,t_0) &= e^{iH_0(t-t_0)}(H-H_0)\,e^{-iH(t-t_0)}\\
&= e^{iH_0(t-t_0)}(H_{\text{int}})\,e^{-iH(t-t_0)}\\
&= e^{iH_0(t-t_0)}(H_{\text{int}})\,e^{-iH_0(t-t_0)}\,e^{iH_0(t-t_0)}e^{-iH(t-t_0)}\\
&= H_I(t)\,U(t,t_0),
\end{aligned}\tag{4.18}$$

where

$$H_I(t) = e^{iH_0(t-t_0)}(H_{\text{int}})\,e^{-iH_0(t-t_0)} = \int d^3x\,\frac{\lambda}{4!}\,\phi_I^4 \tag{4.19}$$

is the interaction Hamiltonian written in the interaction picture. The
solution of this differential equation for $U(t,t_0)$ should look something like
$U \sim \exp(-iH_It)$; this would be our desired formula for $U$ in terms of $\phi_I$.
Doing it more carefully, we will show that the actual solution is the following
power series in $\lambda$:

$$\begin{aligned}
U(t,t_0) = 1 &+ (-i)\int_{t_0}^{t}dt_1\,H_I(t_1) + (-i)^2\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,H_I(t_1)H_I(t_2)\\
&+ (-i)^3\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\int_{t_0}^{t_2}dt_3\,H_I(t_1)H_I(t_2)H_I(t_3) + \cdots.
\end{aligned}\tag{4.20}$$

To verify this, just differentiate: Each term gives the previous one times
$-iH_I(t)$. The initial condition $U(t,t_0) = 1$ for $t = t_0$ is obviously satisfied.

Note that the various factors of $H_I$ in (4.20) stand in time order, later
on the left. This allows us to simplify the expression considerably, using the
time-ordering symbol $T$. The $H_I^2$ term, for example, can be written

$$\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,H_I(t_1)H_I(t_2) = \frac{1}{2}\int_{t_0}^{t}dt_1\int_{t_0}^{t}dt_2\,T\{H_I(t_1)H_I(t_2)\}. \tag{4.21}$$

Figure 4.1. Geometric interpretation of Eq. (4.21).

The double integral on the right-hand side just counts everything twice, since
in the $t_1t_2$-plane, the integrand $T\{H_I(t_1)H_I(t_2)\}$ is symmetric about the line
$t_1 = t_2$ (see Fig. 4.1).

A similar identity holds for the higher terms:

$$\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\cdots\int_{t_0}^{t_{n-1}}dt_n\,H_I(t_1)\cdots H_I(t_n) = \frac{1}{n!}\int_{t_0}^{t}dt_1\cdots dt_n\,T\{H_I(t_1)\cdots H_I(t_n)\}.$$

This case is a little harder to visualize, but it is not hard to convince oneself
that it is true. Using this identity, we can now write $U(t,t_0)$ in an extremely
compact form:

$$\begin{aligned}
U(t,t_0) &= 1 + (-i)\int_{t_0}^{t}dt_1\,H_I(t_1) + \frac{(-i)^2}{2!}\int_{t_0}^{t}dt_1\,dt_2\,T\{H_I(t_1)H_I(t_2)\} + \cdots\\
&\equiv T\Big\{\exp\Big[-i\int_{t_0}^{t}dt'\,H_I(t')\Big]\Big\},
\end{aligned}\tag{4.22}$$

where the time-ordering of the exponential is just defined as the Taylor series 
with each term time-ordered. When we do real computations we will keep 
only the first few terms of the series; the time-ordered exponential is just a 
compact way of writing and remembering the correct expression. 
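The statement that the time-ordered exponential (4.22) reproduces the closed form (4.17) can be checked numerically on a finite-dimensional toy model. The following is a minimal sketch, not part of the original derivation: the 3-level matrices `H0` and `Hint`, the helper `exp_i`, and the midpoint discretization are all illustrative assumptions. The time-ordered exponential is approximated by a product of short-time factors with later times placed on the left.

```python
import numpy as np

def exp_i(H, s):
    """Return exp(i * s * H) for a Hermitian matrix H via its eigen-decomposition."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(1j * s * E)) @ V.conj().T

# Hypothetical toy model: a 3-level "free" Hamiltonian plus a small perturbation.
H0 = np.diag([0.0, 1.0, 2.5])
Hint = 0.3 * np.array([[0.0, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.0]])
H = H0 + Hint
t0, t = 0.0, 1.0

# Closed form, Eq. (4.17): U(t, t0) = exp(i H0 (t - t0)) exp(-i H (t - t0)).
U_exact = exp_i(H0, t - t0) @ exp_i(H, -(t - t0))

def time_ordered_exponential(n_steps):
    """Approximate T exp[-i int H_I dt'] by an ordered product of short-time factors."""
    dt = (t - t0) / n_steps
    U = np.eye(3, dtype=complex)
    for k in range(n_steps):
        tk = t0 + (k + 0.5) * dt  # midpoint of the k-th time slice
        # exp(-i H_I(tk) dt), with H_I(tk) = e^{i H0 (tk-t0)} Hint e^{-i H0 (tk-t0)}
        step = exp_i(H0, tk - t0) @ exp_i(Hint, -dt) @ exp_i(H0, -(tk - t0))
        U = step @ U  # later times are multiplied on the left
    return U

error = np.max(np.abs(time_ordered_exponential(2000) - U_exact))
print(f"max deviation from Eq. (4.17): {error:.2e}")
```

The deviation shrinks as the number of time slices grows, which is the discrete counterpart of differentiating (4.20) term by term.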

We now have control over $\phi(t,\mathbf{x})$; we have written it entirely in terms of
$\phi_I$, as desired. Before moving on to consider $|\Omega\rangle$, however, it is convenient to
generalize the definition of $U$, allowing its second argument to take on values
other than our “reference time” $t_0$. The correct definition is quite natural:

$$U(t,t') \equiv T\Big\{\exp\Big[-i\int_{t'}^{t}dt''\,H_I(t'')\Big]\Big\}. \qquad (t \geq t') \tag{4.23}$$

Several properties follow from this definition, and it is necessary to verify
them. First, $U(t,t')$ satisfies the same differential equation (4.18),

$$i\frac{\partial}{\partial t}U(t,t') = H_I(t)\,U(t,t'), \tag{4.24}$$

but now with the initial condition $U = 1$ for $t = t'$. From this equation you
can show that

$$U(t,t') = e^{iH_0(t-t_0)}\,e^{-iH(t-t')}\,e^{-iH_0(t'-t_0)}, \tag{4.25}$$

which proves that $U$ is unitary. Finally, $U(t,t')$ satisfies the following identities
(for $t_1 \geq t_2 \geq t_3$):

$$\begin{aligned}
U(t_1,t_2)\,U(t_2,t_3) &= U(t_1,t_3);\\
U(t_1,t_3)\,[U(t_2,t_3)]^\dagger &= U(t_1,t_2).
\end{aligned}\tag{4.26}$$
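The unitarity of $U$ and the identities (4.26) follow algebraically from (4.25), and they are easy to confirm numerically. This is a small sketch under stated assumptions: the 3-level matrices, the reference time `t0`, and the chosen times are hypothetical, and `U` is built directly from the closed form (4.25) rather than from the time-ordered exponential.

```python
import numpy as np

def exp_i(H, s):
    """Return exp(i * s * H) for a Hermitian matrix H via its eigen-decomposition."""
    E, V = np.linalg.eigh(H)
    return (V * np.exp(1j * s * E)) @ V.conj().T

# Hypothetical toy Hamiltonians (not from the text).
H0 = np.diag([0.0, 1.0, 2.5])
Hint = 0.3 * np.array([[0.0, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.0]])
H = H0 + Hint
t0 = 0.2  # reference time

def U(t, tp):
    """Eq. (4.25): U(t,t') = e^{i H0 (t-t0)} e^{-i H (t-t')} e^{-i H0 (t'-t0)}."""
    return exp_i(H0, t - t0) @ exp_i(H, -(t - tp)) @ exp_i(H0, -(tp - t0))

def dagger(A):
    return A.conj().T

t1, t2, t3 = 1.7, 0.9, -0.4  # any times with t1 >= t2 >= t3
unitarity_defect = np.max(np.abs(dagger(U(t1, t2)) @ U(t1, t2) - np.eye(3)))
composition_defect = np.max(np.abs(U(t1, t2) @ U(t2, t3) - U(t1, t3)))
inverse_defect = np.max(np.abs(U(t1, t3) @ dagger(U(t2, t3)) - U(t1, t2)))
print(unitarity_defect, composition_defect, inverse_defect)
```

All three defects should be at the level of floating-point rounding.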

Now we can go on to discuss $|\Omega\rangle$. Since $|\Omega\rangle$ is the ground state of $H$,
we can isolate it by the following procedure. Imagine starting with $|0\rangle$, the
ground state of $H_0$, and evolving through time with $H$:

$$e^{-iHT}|0\rangle = \sum_n e^{-iE_nT}\,|n\rangle\langle n|0\rangle,$$

where $E_n$ are the eigenvalues of $H$. We must assume that $|\Omega\rangle$ has some overlap
with $|0\rangle$, that is, $\langle\Omega|0\rangle \neq 0$ (if this were not the case, $H_I$ would in no sense be
a small perturbation). Then the above series contains $|\Omega\rangle$, and we can write

$$e^{-iHT}|0\rangle = e^{-iE_0T}\,|\Omega\rangle\langle\Omega|0\rangle + \sum_{n\neq0}e^{-iE_nT}\,|n\rangle\langle n|0\rangle,$$

where $E_0 \equiv \langle\Omega|H|\Omega\rangle$. (The zero of energy will be defined by $H_0|0\rangle = 0$.)
Since $E_n > E_0$ for all $n \neq 0$, we can get rid of all the $n \neq 0$ terms in the series
by sending $T$ to $\infty$ in a slightly imaginary direction: $T \to \infty(1-i\epsilon)$. Then
the exponential factor $e^{-iE_nT}$ dies slowest for $n = 0$, and we have

$$|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\left(e^{-iE_0T}\langle\Omega|0\rangle\right)^{-1}e^{-iHT}|0\rangle. \tag{4.27}$$
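The projection onto the ground state in (4.27) can be illustrated with a small matrix Hamiltonian. The sketch below is only an illustration under assumptions: the 3x3 matrix `H`, the choice of the first basis vector as the "free vacuum", and the values of `tau` and `eps` are hypothetical. It evolves the vacuum with a slightly imaginary time $T = \tau(1-i\epsilon)$ and divides by the c-number prefactor, as in (4.27).

```python
import numpy as np

# Hypothetical interacting Hamiltonian on a 3-dimensional Hilbert space.
H = np.array([[0.0, 0.4, 0.1],
              [0.4, 1.0, 0.3],
              [0.1, 0.3, 2.0]])
E, V = np.linalg.eigh(H)         # E[0] is the ground-state energy E_0
omega = V[:, 0]                  # exact ground state |Omega>
vac = np.array([1.0, 0.0, 0.0])  # stand-in for the free vacuum |0>

def projected_ground_state(tau, eps):
    """Right-hand side of Eq. (4.27) at finite T = tau * (1 - i * eps)."""
    T = tau * (1 - 1j * eps)
    # e^{-iHT}|0> expanded in eigenstates of H: sum_n e^{-i E_n T} |n><n|0>
    evolved = V @ (np.exp(-1j * E * T) * (V.conj().T @ vac))
    prefactor = np.exp(-1j * E[0] * T) * (omega.conj() @ vac)
    return evolved / prefactor

for tau in (10.0, 100.0, 600.0):
    deviation = np.max(np.abs(projected_ground_state(tau, 0.05) - omega))
    print(f"tau = {tau:6.1f}: deviation from |Omega> = {deviation:.2e}")
```

The excited-state admixture is suppressed by $e^{-(E_n-E_0)\epsilon\tau}$, so the deviation falls off exponentially as `tau` grows.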

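Since $T$ is now very large, we can shift it by a small constant:

$$\begin{aligned}
|\Omega\rangle &= \lim_{T\to\infty(1-i\epsilon)}\left(e^{-iE_0(T+t_0)}\langle\Omega|0\rangle\right)^{-1}e^{-iH(T+t_0)}|0\rangle\\
&= \lim_{T\to\infty(1-i\epsilon)}\left(e^{-iE_0(t_0-(-T))}\langle\Omega|0\rangle\right)^{-1}e^{-iH(t_0-(-T))}\,e^{-iH_0(-T-t_0)}|0\rangle\\
&= \lim_{T\to\infty(1-i\epsilon)}\left(e^{-iE_0(t_0-(-T))}\langle\Omega|0\rangle\right)^{-1}U(t_0,-T)\,|0\rangle.
\end{aligned}\tag{4.28}$$

In the second line we have used $H_0|0\rangle = 0$. Ignoring the c-number factor in
front, this expression tells us that we can get $|\Omega\rangle$ by simply evolving $|0\rangle$ from
time $-T$ to time $t_0$ with the operator $U$. Similarly, we can express $\langle\Omega|$ as

$$\langle\Omega| = \lim_{T\to\infty(1-i\epsilon)}\langle0|\,U(T,t_0)\left(e^{-iE_0(T-t_0)}\langle0|\Omega\rangle\right)^{-1}. \tag{4.29}$$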


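Let us put together the pieces of the two-point correlation function. For
the moment, assume that $x^0 > y^0 > t_0$. Then

$$\begin{aligned}
\langle\Omega|\,\phi(x)\phi(y)\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}&\left(e^{-iE_0(T-t_0)}\langle0|\Omega\rangle\right)^{-1}\langle0|\,U(T,t_0)\\
&\times[U(x^0,t_0)]^\dagger\phi_I(x)\,U(x^0,t_0)\,[U(y^0,t_0)]^\dagger\phi_I(y)\,U(y^0,t_0)\\
&\times U(t_0,-T)\,|0\rangle\left(e^{-iE_0(t_0-(-T))}\langle\Omega|0\rangle\right)^{-1}\\
= \lim_{T\to\infty(1-i\epsilon)}&\left(|\langle0|\Omega\rangle|^2e^{-iE_0(2T)}\right)^{-1}\\
&\times\langle0|\,U(T,x^0)\,\phi_I(x)\,U(x^0,y^0)\,\phi_I(y)\,U(y^0,-T)\,|0\rangle.
\end{aligned}\tag{4.30}$$

This is starting to look simple, except for the awkward factor in front. To get
rid of it, divide by 1 in the form

$$1 = \langle\Omega|\Omega\rangle = \left(|\langle0|\Omega\rangle|^2e^{-iE_0(2T)}\right)^{-1}\langle0|\,U(T,t_0)\,U(t_0,-T)\,|0\rangle.$$

Then our formula, still for $x^0 > y^0$, becomes

$$\langle\Omega|\,\phi(x)\phi(y)\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\langle0|\,U(T,x^0)\,\phi_I(x)\,U(x^0,y^0)\,\phi_I(y)\,U(y^0,-T)\,|0\rangle}{\langle0|\,U(T,-T)\,|0\rangle}.$$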

Now note that all fields on both sides of this expression are in time order. If
we had considered the case $y^0 > x^0$ this would still be true. Thus we arrive
at our final expression, now valid for any $x^0$ and $y^0$:

$$\langle\Omega|\,T\{\phi(x)\phi(y)\}\,|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)}\frac{\langle0|\,T\Big\{\phi_I(x)\phi_I(y)\exp\Big[-i\int_{-T}^{T}dt\,H_I(t)\Big]\Big\}\,|0\rangle}{\langle0|\,T\Big\{\exp\Big[-i\int_{-T}^{T}dt\,H_I(t)\Big]\Big\}\,|0\rangle}. \tag{4.31}$$

The virtue of considering the time-ordered product is clear: It allows us to
put everything inside one large $T$-operator. A similar formula holds for higher
correlation functions of arbitrarily many fields; for each extra factor of $\phi$ on
the left, put an extra factor of $\phi_I$ on the right. So far this expression is exact.
But it is ideally suited to doing perturbative calculations; we need only retain
as many terms as desired in the Taylor series expansions of the exponentials.
4.3 Wick’s Theorem


We have now reduced the problem of calculating correlation functions to that
of evaluating expressions of the form

$$\langle0|\,T\{\phi_I(x_1)\phi_I(x_2)\cdots\phi_I(x_n)\}\,|0\rangle,$$

that is, vacuum expectation values of time-ordered products of finite (but
arbitrary) numbers of free field operators. For $n = 2$ this expression is just
the Feynman propagator. For higher $n$ you could evaluate this object by brute
force, plugging in the expansion of $\phi_I$ in terms of ladder operators. In this
section and the next, however, we will see how to simplify such calculations
immensely.

Consider again the case of two fields, $\langle0|\,T\{\phi_I(x)\phi_I(y)\}\,|0\rangle$. We already
know how to calculate this quantity, but now we would like to rewrite it in
a form that is easy to evaluate and also generalizes to the case of more than
two fields. To do this we first decompose $\phi_I(x)$ into positive- and
negative-frequency parts:

$$\phi_I(x) = \phi_I^+(x) + \phi_I^-(x), \tag{4.32}$$


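where

$$\phi_I^+(x) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_\mathbf{p}}}\,a_\mathbf{p}\,e^{-ip\cdot x}; \qquad \phi_I^-(x) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_\mathbf{p}}}\,a^\dagger_\mathbf{p}\,e^{+ip\cdot x}.$$

This decomposition can be done for any free field. It is useful because

$$\phi_I^+(x)\,|0\rangle = 0 \quad\text{and}\quad \langle0|\,\phi_I^-(x) = 0.$$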

For example, consider the case $x^0 > y^0$. The time-ordered product of two
fields is then

$$\begin{aligned}
T\phi_I(x)\phi_I(y) \overset{x^0>y^0}{=}{}& \phi_I^+(x)\phi_I^+(y) + \phi_I^+(x)\phi_I^-(y) + \phi_I^-(x)\phi_I^+(y) + \phi_I^-(x)\phi_I^-(y)\\
={}& \phi_I^+(x)\phi_I^+(y) + \phi_I^-(y)\phi_I^+(x) + \phi_I^-(x)\phi_I^+(y) + \phi_I^-(x)\phi_I^-(y)\\
&+ [\phi_I^+(x),\phi_I^-(y)].
\end{aligned}\tag{4.33}$$

In every term except the commutator, all the $a_\mathbf{p}$’s are to the right of all the
$a^\dagger_\mathbf{p}$’s. Such a term (e.g., $a^\dagger_\mathbf{p}a^\dagger_\mathbf{q}a_\mathbf{k}a_\mathbf{l}$) is said to be in normal order, and has
vanishing vacuum expectation value. Let us also define the normal ordering
symbol $N()$ to place whatever operators it contains in normal order, for
example,

$$N(a_\mathbf{p}a^\dagger_\mathbf{k}a_\mathbf{q}) \equiv a^\dagger_\mathbf{k}a_\mathbf{p}a_\mathbf{q}. \tag{4.34}$$

The order of $a_\mathbf{p}$ and $a_\mathbf{q}$ on the right-hand side makes no difference since they
commute.*
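For bosonic ladder operators, the normal-ordering symbol simply moves every creation operator to the left of every annihilation operator, discarding all commutator terms. A minimal sketch of that bookkeeping follows; the representation of an operator as a `(momentum_label, is_creation)` pair and the function name `normal_order` are assumptions made for illustration only.

```python
def normal_order(ops):
    """Place creation operators to the left of annihilation operators.

    Each operator is a (momentum_label, is_creation) pair. The sort is stable,
    so the relative order within each group is kept; for bosons that order is
    immaterial anyway, since the a's commute among themselves and so do the
    a-daggers.
    """
    return sorted(ops, key=lambda op: not op[1])

# Example (4.34): N(a_p a_k^dagger a_q) = a_k^dagger a_p a_q
product = [("p", False), ("k", True), ("q", False)]
ordered = normal_order(product)
print(ordered)  # [('k', True), ('p', False), ('q', False)]
```

Because no commutators are generated, `N` is a reordering prescription rather than an operator identity; the commutators dropped here are exactly what the contractions in Wick's theorem restore.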


*In the literature one often sees the notation $:\!\phi_1\phi_2\!:$ instead of $N(\phi_1\phi_2)$.
If we had instead considered the case $y^0 > x^0$, we would get the same four
normal-ordered terms as in (4.33), but this time the final commutator would
be $[\phi_I^+(y),\phi_I^-(x)]$. Let us therefore define one more quantity, the contraction
of two fields, as follows:

$$\wick{\c\phi(x)\,\c\phi(y)} \equiv \begin{cases}[\phi^+(x),\phi^-(y)] & \text{for } x^0 > y^0;\\ [\phi^+(y),\phi^-(x)] & \text{for } y^0 > x^0.\end{cases} \tag{4.35}$$

This quantity is exactly the Feynman propagator:

$$\wick{\c\phi(x)\,\c\phi(y)} = D_F(x-y). \tag{4.36}$$

(From here on we will often drop the subscript $I$ for convenience; contractions
will always involve interaction-picture fields.)

The relation between time-ordering and normal-ordering is now extremely
simple to express, at least for two fields:

$$T\{\phi(x)\phi(y)\} = N\{\phi(x)\phi(y) + \wick{\c\phi(x)\,\c\phi(y)}\}. \tag{4.37}$$

But now that we have all this new notation, the generalization to arbitrarily
many fields is also easy to write down:

$$T\{\phi(x_1)\phi(x_2)\cdots\phi(x_m)\} = N\{\phi(x_1)\phi(x_2)\cdots\phi(x_m) + \text{all possible contractions}\}. \tag{4.38}$$

This identity is known as Wick’s theorem, and we will prove it in a moment.
For $m = 2$ it is identical to (4.37). The phrase all possible contractions means
there will be one term for each possible way of contracting the $m$ fields in
pairs. Thus for $m = 4$ we have (writing $\phi(x_a)$ as $\phi_a$ for brevity)

$$\begin{aligned}
T\{\phi_1\phi_2\phi_3\phi_4\} = N\{&\phi_1\phi_2\phi_3\phi_4 + \wick{\c \phi_1 \c \phi_2 \phi_3 \phi_4} + \wick{\c \phi_1 \phi_2 \c \phi_3 \phi_4} + \wick{\c \phi_1 \phi_2 \phi_3 \c \phi_4}\\
&+ \wick{\phi_1 \c \phi_2 \c \phi_3 \phi_4} + \wick{\phi_1 \c \phi_2 \phi_3 \c \phi_4} + \wick{\phi_1 \phi_2 \c \phi_3 \c \phi_4}\\
&+ \wick{\c1 \phi_1 \c1 \phi_2 \c2 \phi_3 \c2 \phi_4} + \wick{\c1 \phi_1 \c2 \phi_2 \c1 \phi_3 \c2 \phi_4} + \wick{\c1 \phi_1 \c2 \phi_2 \c2 \phi_3 \c1 \phi_4}\}.
\end{aligned}\tag{4.39}$$

When the contraction symbol connects two operators that are not adjacent,
we still define it to give a factor of $D_F$. For example,

$$N\{\wick{\c \phi_1 \phi_2 \c \phi_3 \phi_4}\} \quad\text{means}\quad D_F(x_1-x_3)\cdot N\{\phi_2\phi_4\}.$$

In the vacuum expectation value of (4.39), any term in which there remain
uncontracted operators gives zero (since $\langle0|\,N(\text{any operator})\,|0\rangle = 0$). Only
the three fully contracted terms in the last line survive, and they are all
c-numbers, so we have

$$\begin{aligned}
\langle0|\,T\{\phi_1\phi_2\phi_3\phi_4\}\,|0\rangle ={}& D_F(x_1-x_2)\,D_F(x_3-x_4)\\
&+ D_F(x_1-x_3)\,D_F(x_2-x_4)\\
&+ D_F(x_1-x_4)\,D_F(x_2-x_3).
\end{aligned}\tag{4.40}$$
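The fully contracted terms that survive in a vacuum expectation value are in one-to-one correspondence with the ways of splitting the fields into pairs. The short sketch below enumerates these pairings; the function names `pairings` and `count_full_contractions` are hypothetical, chosen for this illustration. For four fields it reproduces the three terms of (4.40), and in general the count is $(m-1)!! = (m-1)(m-3)\cdots 1$ for even $m$.

```python
def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def count_full_contractions(m):
    """Number of complete contractions of m field operators."""
    return sum(1 for _ in pairings(list(range(m))))

# The three pairings of four fields, matching the three terms of Eq. (4.40):
four_field_terms = list(pairings([1, 2, 3, 4]))
print(four_field_terms)
# [[(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)]]

print([count_full_contractions(m) for m in (2, 4, 6, 8)])  # [1, 3, 15, 105]
```

Each pair `(a, b)` in a pairing stands for one factor $D_F(x_a - x_b)$.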
Now let us prove Wick’s theorem. Naturally the proof is by induction
on $m$, the number of fields. We have already proved the case $m = 2$. So
assume the theorem is true for $m - 1$ fields, and let’s try to prove it for
$m$ fields. Without loss of generality, we can restrict ourselves to the case
$x_1^0 \geq x_2^0 \geq \cdots \geq x_m^0$; if this is not the case we can just relabel the points, without
affecting either side of (4.38). Then applying Wick’s theorem to $\phi_2\cdots\phi_m$, we
have

$$\begin{aligned}
T\{\phi_1\cdots\phi_m\} &= \phi_1\cdots\phi_m\\
&= \phi_1\,T\{\phi_2\cdots\phi_m\}\\
&= \phi_1\,N\Big\{\phi_2\cdots\phi_m + \Big(\text{all contractions not involving } \phi_1\Big)\Big\}\\
&= (\phi_1^+ + \phi_1^-)\,N\Big\{\phi_2\cdots\phi_m + \Big(\text{all contractions not involving } \phi_1\Big)\Big\}.
\end{aligned}\tag{4.41}$$

We want to move the $\phi_1^+$ and $\phi_1^-$ inside the $N\{\}$. For the $\phi_1^-$ term this is easy:
Just move it in, since (being on the left) it is already in normal order. The
term with $\phi_1^+$ must be put in normal order by commuting it to the right past
all the other $\phi$’s. Consider, for example, the term with no contractions:

$$\begin{aligned}
\phi_1^+\,N(\phi_2\cdots\phi_m) ={}& N(\phi_2\cdots\phi_m)\,\phi_1^+ + [\phi_1^+, N(\phi_2\cdots\phi_m)]\\
={}& N(\phi_1^+\phi_2\cdots\phi_m)\\
&+ N\big([\phi_1^+,\phi_2^-]\phi_3\cdots\phi_m + \phi_2[\phi_1^+,\phi_3^-]\phi_4\cdots\phi_m + \cdots\big)\\
={}& N\big(\phi_1^+\phi_2\cdots\phi_m + \wick{\c \phi_1 \c \phi_2}\phi_3\cdots\phi_m + \wick{\c \phi_1 \phi_2 \c \phi_3}\cdots\phi_m + \cdots\big).
\end{aligned}$$

The first term in the last line combines with part of the $\phi_1^-$ term from (4.41) to
give $N\{\phi_1\phi_2\cdots\phi_m\}$, so we now have the first term on the right-hand side of
Wick’s theorem, as well as all possible terms involving a single contraction of
$\phi_1$ with another field. Similarly, a term in (4.41) involving one contraction will
produce all possible terms involving both that contraction and a contraction
of $\phi_1$ with one of the other fields. Doing this with all the terms of (4.41), we
eventually get all possible contractions of all the fields, including $\phi_1$. Thus the
induction step is complete, and Wick’s theorem is proved.

4.4 Feynman Diagrams 

Wick’s theorem allows us to turn any expression of the form

$$\langle0|\,T\{\phi_I(x_1)\phi_I(x_2)\cdots\phi_I(x_n)\}\,|0\rangle$$

into a sum of products of Feynman propagators. Now we are ready to develop
a diagrammatic interpretation of such expressions. Consider first the case of
four fields, all at different spacetime points, which we worked out in Eq. (4.40).
Let us represent each of the points $x_1$ through $x_4$ by a dot, and each factor
$D_F(x-y)$ by a line joining $x$ to $y$. Then Eq. (4.40) can be represented as the
sum of three diagrams (called Feynman diagrams):

$$\langle0|\,T\{\phi_1\phi_2\phi_3\phi_4\}\,|0\rangle = \text{[three diagrams, joining the four points in pairs: 1-2 and 3-4; 1-3 and 2-4; 1-4 and 2-3]} \tag{4.42}$$


Although this isn’t exactly a measurable quantity, the diagrams do suggest an 
interpretation: Particles are created at two spacetime points, each propagates 
to one of the other points, and then they are annihilated. This can happen in 
three ways, corresponding to the three ways to connect the points in pairs, as 
shown in the three diagrams. The total amplitude for the process is the sum 
of the three diagrams. 

Things get more interesting when the expression contains more than one
field at the same spacetime point. So let us now return to the evaluation of
the two-point function $\langle\Omega|\,T\{\phi(x)\phi(y)\}\,|\Omega\rangle$, and put formula (4.31) to use. We
will ignore the denominator until the very end of this section. The numerator,
with the exponential expanded as a power series, is

$$\langle0|\,T\Big\{\phi(x)\phi(y) + \phi(x)\phi(y)\Big[-i\int dt\,H_I(t)\Big] + \cdots\Big\}\,|0\rangle. \tag{4.43}$$

The first term gives the free-field result, $\langle0|\,T\{\phi(x)\phi(y)\}\,|0\rangle = D_F(x-y)$. The
second term, in $\phi^4$ theory, is

$$\begin{aligned}
&\langle0|\,T\Big\{\phi(x)\phi(y)\,(-i)\int dt\int d^3z\,\frac{\lambda}{4!}\,\phi^4\Big\}\,|0\rangle\\
&\qquad= \langle0|\,T\Big\{\phi(x)\phi(y)\Big(\frac{-i\lambda}{4!}\Big)\int d^4z\,\phi(z)\phi(z)\phi(z)\phi(z)\Big\}\,|0\rangle.
\end{aligned}$$

Now apply Wick’s theorem. We get one term for every way of contracting the
six $\phi$ operators with each other in pairs. There are 15 ways to do this, but
(fortunately) only two of them are really different. If we contract $\phi(x)$ with
$\phi(y)$, then there are three ways to contract the four $\phi(z)$’s with each other,
and all three give identical expressions. The other possibility is to contract
$\phi(x)$ with one of the $\phi(z)$’s (four choices), $\phi(y)$ with one of the others (three
choices), and the remaining two $\phi(z)$’s with each other (one choice). There
are twelve ways to do this, and all give identical expressions. Thus we have

$$\begin{aligned}
\langle0|\,T\Big\{\phi(x)\phi(y)\,(-i)\int dt\int d^3z\,\frac{\lambda}{4!}\,\phi^4\Big\}\,|0\rangle
={}& 3\cdot\Big(\frac{-i\lambda}{4!}\Big)D_F(x-y)\int d^4z\,D_F(z-z)\,D_F(z-z)\\
&+ 12\cdot\Big(\frac{-i\lambda}{4!}\Big)\int d^4z\,D_F(x-z)\,D_F(y-z)\,D_F(z-z).
\end{aligned}\tag{4.44}$$
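The counting that leads to the factors 3 and 12 in (4.44) can be reproduced by brute force. The sketch below is an illustration only; the helper `pairings` and the way a contraction pattern is reduced to a "topology" are assumptions of this example. It enumerates all 15 complete pairings of the six operators $\phi(x)\phi(y)\phi(z)\phi(z)\phi(z)\phi(z)$ and groups them by which spacetime points each propagator connects.

```python
from collections import Counter

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# One label per operator: phi(x), phi(y), and four copies of phi(z).
fields = ["x", "y", "z", "z", "z", "z"]

counts = Counter()
for pairing in pairings(list(range(len(fields)))):
    # Forget which copy of phi(z) was used; keep only the endpoints of each line.
    topology = tuple(sorted(tuple(sorted((fields[a], fields[b]))) for a, b in pairing))
    counts[topology] += 1

for topology, n in sorted(counts.items()):
    print(n, topology)
# 3 (('x', 'y'), ('z', 'z'), ('z', 'z'))
# 12 (('x', 'z'), ('y', 'z'), ('z', 'z'))
```

The first class gives $D_F(x-y)D_F(z-z)D_F(z-z)$ and the second gives $D_F(x-z)D_F(y-z)D_F(z-z)$, with the multiplicities quoted in (4.44).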
We can understand this expression better if we represent each term as a
Feynman diagram. Again we draw each contraction $D_F$ as a line, and each
point as a dot. But this time we must distinguish between the “external”
points, $x$ and $y$, and the “internal” point $z$; each internal point is associated
with a factor of $(-i\lambda)\int d^4z$. We will worry about the constant factors in a
moment. Using these rules, we see that the above expression (4.44) is equal
to the sum of two diagrams:


We refer to the lines in these diagrams as propagators, since they represent the
propagation amplitude $D_F$. Internal points, where four lines meet, are called
vertices. Since $D_F(x-y)$ is the amplitude for a free Klein-Gordon particle
to propagate between $x$ and $y$, the diagrams actually interpret the analytic
formula as a process of particle creation, propagation, and annihilation which
takes place in spacetime.

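Now let’s try a more complicated contraction, from the $\lambda^3$ term in the
expansion of the correlation function:

$$\begin{aligned}
&\langle0|\,\phi(x)\phi(y)\,\frac{1}{3!}\Big(\frac{-i\lambda}{4!}\Big)^3\int d^4z\,\phi\phi\phi\phi\int d^4w\,\phi\phi\phi\phi\int d^4u\,\phi\phi\phi\phi\,|0\rangle\\
&\qquad= \frac{1}{3!}\Big(\frac{-i\lambda}{4!}\Big)^3\int d^4z\,d^4w\,d^4u\;D_F(x-z)\,D_F(z-z)\,D_F(z-w)\\
&\qquad\qquad\times D_F(w-y)\,D_F^2(w-u)\,D_F(u-u),
\end{aligned}\tag{4.45}$$

where the contraction considered joins $x$ to $z$, $z$ to itself, $z$ to $w$, $w$ to $y$,
$w$ to $u$ (twice), and $u$ to itself.

The number of “different” contractions that give this same expression is large:

$$\underbrace{3!}_{\substack{\text{interchange}\\ \text{of vertices}}}\times\underbrace{4\cdot3}_{\substack{\text{placement of}\\ \text{contractions}\\ \text{into } z \text{ vertex}}}\times\underbrace{4\cdot3\cdot2}_{\substack{\text{placement of}\\ \text{contractions}\\ \text{into } w \text{ vertex}}}\times\underbrace{4\cdot3}_{\substack{\text{placement of}\\ \text{contractions}\\ \text{into } u \text{ vertex}}}\times\underbrace{1/2}_{\substack{\text{interchange}\\ \text{of } w\text{-}u\\ \text{contractions}}}$$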

The product of these combinatoric factors is 10,368, roughly 1/13 of the total 
of 135,135 possible full contractions of the 14 operators. The structure of this 
particular contraction can be represented by the following “cactus” diagram: 


It is conventional, for obvious reasons, to let this one diagram represent the 
sum of all 10,368 identical terms. 
In practice one always draws the diagram first, using it as a mnemonic
device for writing down the analytic expression. But then the question arises,
What is the overall constant? We could, of course, work it out as above: We
could associate a factor $\int d^4z\,(-i\lambda/4!)$ with each vertex, put in the $1/n!$ from
the Taylor series, and then do the combinatorics by writing out the product
of fields as in (4.45) and counting. But the $1/n!$ from the Taylor series will
almost always cancel the $n!$ from interchanging the vertices, so we can just
forget about both of these factors. Furthermore, the generic vertex has four
lines coming in from four different places, so the various placements of these
contractions into $\phi\phi\phi\phi$ generates a factor of $4!$ (as in the $w$ vertex above),
which cancels the denominator in $(-i\lambda/4!)$. It is therefore conventional to
associate the expression $\int d^4z\,(-i\lambda)$ with each vertex. (This was the reason
for the factor of $4!$ in the $\phi^4$ coupling.)

In the above diagram, this scheme gives a constant that is too large by 
a factor of 8 = 2 • 2 • 2, the symmetry factor of the diagram. Two factors 
of 2 come from lines that start and end on the same vertex: The diagram is 
symmetric under the interchange of the two ends of such a line. The other 
factor of 2 comes from the two propagators connecting w to u: The diagram is 
symmetric under the interchange of these two lines with each other. A third 
possible type of symmetry is the equivalence of two vertices. To get the correct 
overall constant for a diagram, we divide by its symmetry factor, which is in 
general the number of ways of interchanging components without changing 
the diagram. 
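The count of 10,368 equivalent contractions and the resulting symmetry factor of 8 for the cactus diagram can be confirmed by direct enumeration. The sketch below is an illustration under assumptions: the helper names (`pairings`, `canonical`) and the encoding of the diagram as a list of edges are hypothetical. It runs through all 135,135 complete pairings of the 14 operators, keeps those whose lines connect the points as in (4.45) up to relabeling of the three vertices, and then evaluates $S = n!\,(4!)^n / (\text{number of contractions})$ with $n = 3$.

```python
from itertools import permutations
from math import factorial

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def canonical(edges):
    """Order-independent description of a diagram as a sorted tuple of edges."""
    return tuple(sorted(tuple(sorted(e)) for e in edges))

# phi(x), phi(y), and four fields at each of the vertices z, w, u.
fields = ["x", "y"] + ["z"] * 4 + ["w"] * 4 + ["u"] * 4

# Lines of the cactus diagram, read off from the propagators in Eq. (4.45).
cactus = [("x", "z"), ("z", "z"), ("z", "w"), ("w", "y"),
          ("w", "u"), ("w", "u"), ("u", "u")]

# The same diagram with the integration variables z, w, u relabeled.
targets = set()
for perm in permutations("zwu"):
    relabel = dict(zip("zwu", perm), x="x", y="y")
    targets.add(canonical((relabel[a], relabel[b]) for a, b in cactus))

n_contractions = sum(
    1 for p in pairings(list(range(len(fields))))
    if canonical((fields[a], fields[b]) for a, b in p) in targets
)
symmetry_factor = factorial(3) * factorial(4) ** 3 // n_contractions
print(n_contractions, symmetry_factor)  # 10368 8
```

The same brute-force procedure works for any diagram, although the number of pairings grows quickly with the order in $\lambda$.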

Most people never need to evaluate a diagram with a symmetry factor 
greater than 2, so there’s no need to worry too much about these technicalities. 
But here are a few examples, to make some sense out of the above rules: 


When in doubt, you can always determine the symmetry factor by counting 
equivalent contractions, as we did above. 

We are now ready to summarize our rules for calculating the numerator
of our expression (4.31) for $\langle\Omega|\,T\phi(x)\phi(y)\,|\Omega\rangle$:

$$\langle0|\,T\Big\{\phi_I(x)\phi_I(y)\exp\Big[-i\int dt\,H_I(t)\Big]\Big\}\,|0\rangle = \Big(\text{sum of all possible diagrams with two external points}\Big),$$

where each diagram is built out of propagators, vertices, and external points.
The rules for associating analytic expressions with pieces of diagrams are
called the Feynman rules. In $\phi^4$ theory the rules are:

1. For each propagator (a line joining $x$ to $y$), $= D_F(x-y)$;

2. For each vertex (four lines meeting at $z$), $= (-i\lambda)\int d^4z$;

3. For each external point, $= 1$;

4. Divide by the symmetry factor.


One way to interpret these rules is to think of the vertex factor $(-i\lambda)$ as
the amplitude for the emission and/or absorption of particles at a vertex. The
integral $\int d^4z$ instructs us to sum over all points where this process can occur.
This is just the superposition principle of quantum mechanics: When a process
can happen in alternative ways, we add the amplitudes for each possible way.
To compute each individual amplitude, the Feynman rules tell us to multiply
the amplitudes (propagators and vertex factors) for each independent part of
the process.

Since these rules are written in terms of the spacetime points $x$, $y$, etc.,
they are sometimes called the position-space Feynman rules. In most
calculations, it is simpler to express the Feynman rules in terms of momenta, by
introducing the Fourier expansion of each propagator:

$$D_F(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m^2+i\epsilon}\,e^{-ip\cdot(x-y)}. \tag{4.46}$$

Represent this in the diagram by assigning a 4-momentum $p$ to each
propagator, indicating the direction with an arrow. (Since $D_F(x-y) = D_F(y-x)$,
the direction of $p$ is arbitrary.) Then when four lines meet at a vertex, the
$z$-dependent factors of the diagram are

$$\int d^4z\;e^{-ip_1\cdot z}\,e^{-ip_2\cdot z}\,e^{-ip_3\cdot z}\,e^{+ip_4\cdot z} = (2\pi)^4\,\delta^{(4)}(p_1+p_2+p_3-p_4). \tag{4.47}$$


In other words, momentum is conserved at each vertex. The delta functions
from the vertices can now be used to perform some of the momentum
integrals from the propagators. We are left with the following momentum-space
Feynman rules:

1. For each propagator, $= \dfrac{i}{p^2-m^2+i\epsilon}$;

2. For each vertex, $= -i\lambda$;

3. For each external point, $= e^{-ip\cdot x}$;

4. Impose momentum conservation at each vertex;

5. Integrate over each undetermined momentum: $\displaystyle\int\frac{d^4p}{(2\pi)^4}$;

6. Divide by the symmetry factor.

Again, we can interpret each factor as the amplitude for that part of the 
process, with the integrations coming from the superposition principle. The 
exponential factor for an external point is just the amplitude for a particle at 
that point to have the needed momentum, or, depending on the direction of 
the arrow, for a particle with a certain momentum to be found at that point. 
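As an illustration of how the two sets of rules connect, here is a short worked example (a sketch, not part of the original text) for the order-$\lambda$ diagram in which a line runs from $x$ through a vertex $z$ to $y$, with a closed loop attached at $z$. Its position-space value is the second term of (4.44), with $12/4! = 1/2$, that is, a symmetry factor of 2. Inserting the Fourier expansion (4.46) for each propagator, with the loop momentum called $k$ (a label chosen here), gives:

```latex
% Position space: symmetry factor 2, one vertex at z.
\frac{1}{2}\,(-i\lambda)\int d^4z\; D_F(x-z)\,D_F(z-z)\,D_F(z-y)

% Insert Eq. (4.46) for each propagator, with momenta p, k, q.
% The z-dependent factors are e^{+ip\cdot z}\,e^{-iq\cdot z}, so the z integral
% gives (2\pi)^4\,\delta^{(4)}(p-q), which removes the q integral:
= \frac{(-i\lambda)}{2}\int\frac{d^4p}{(2\pi)^4}\; e^{-ip\cdot(x-y)}
  \left(\frac{i}{p^2-m^2+i\epsilon}\right)^{2}
  \int\frac{d^4k}{(2\pi)^4}\;\frac{i}{k^2-m^2+i\epsilon}
```

The result has one factor of $-i\lambda$ for the vertex, one propagator factor for each of the three lines, the external-point exponentials combined into $e^{-ip\cdot(x-y)}$, and an integral over each momentum not fixed by momentum conservation.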

This nearly completes our discussion of the computation of correlation
functions, but there are still a few loose ends. First, what happened to the
large time $T$ that was taken to $\infty(1-i\epsilon)$? We glossed over it completely in
this section, starting with Eq. (4.43). The place to put it back is Eq. (4.47),
where instead of just integrating over $d^4z$, we should have

$$\lim_{T\to\infty(1-i\epsilon)}\int_{-T}^{T}dz^0\int d^3z\;e^{-i(p_1+p_2+p_3-p_4)\cdot z}.$$

The exponential blows up as $z^0 \to \infty$ or $z^0 \to -\infty$ unless its argument
is purely imaginary. To achieve this, we can take each $p^0$ to have a small
imaginary part: $p^0 \propto (1+i\epsilon)$. But this is precisely what we do in following
the Feynman boundary conditions for computing $D_F$: We integrate along a
contour that is rotated slightly away from the real axis, so that $p^0 \propto (1+i\epsilon)$:
The explicit dependence on $T$ seems to disappear when we take the limit
$T \to \infty$ in (4.46). But consider the diagram

(4.48)

The delta function for the left-hand vertex is $(2\pi)^4\delta^{(4)}(p_1+p_2)$, so momentum
conservation at the right-hand vertex is automatically satisfied, and we get
$(2\pi)^4\delta^{(4)}(0)$ there. This awkward factor is easy to understand by going back to
position space. It is simply the integral of a constant over $d^4w$:

$$\int d^4w\,(\text{const}) \propto (2T)\cdot(\text{volume of space}). \tag{4.49}$$

This just tells us that the spacetime process (4.48) can happen at any place
in space, and at any time between $-T$ and $T$. Every disconnected piece of a
diagram, that is, every piece that is not connected to an external point, will
have one such $(2\pi)^4\delta^{(4)}(0) = 2T\cdot V$ factor.

The contributions to the correlation function coming from such diagrams
can be better understood with the help of a very pretty identity, the
exponentiation of the disconnected diagrams. It works as follows. A typical diagram
has the form

(4.50)

with a piece connected to $x$ and $y$, and several disconnected pieces. (Since each
vertex has an even number of lines coming into it, $x$ and $y$ must be connected
to each other.) Label the various possible disconnected pieces by $V_i$:

$$V_i \in \{\text{[the set of possible disconnected pieces]}\}. \tag{4.51}$$

The elements $V_i$ are connected internally, but disconnected from external
points. Suppose that a diagram (such as (4.50)) has $n_i$ pieces of the form
$V_i$, for each $i$, in addition to its one piece that is connected to $x$ and $y$. (In
any given diagram, only finitely many of the $n_i$ will be nonzero.) If we also
let $V_i$ denote the value of the piece $V_i$, then the value of such a diagram is

$$(\text{value of connected piece})\cdot\prod_i\frac{1}{n_i!}\,(V_i)^{n_i}.$$

The $1/n_i!$ is the symmetry factor coming from interchanging the $n_i$ copies of
$V_i$. The sum of all diagrams, representing the numerator of our formula for
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the two-point correlation function, is then 


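$$\sum_{\substack{\text{all possible}\\ \text{connected pieces}}} \; \sum_{\text{all } \{n_i\}} \Big(\text{value of connected piece}\Big) \times \Big(\prod_i \frac{1}{n_i!} (V_i)^{n_i}\Big),$$

where “all $\{n_i\}$” means “all ordered sets $\{n_1, n_2, n_3, \ldots\}$ of nonnegative
integers.” The sum of the connected pieces factors out of this expression, giving

$$= \Big(\sum \text{connected}\Big) \times \sum_{\text{all } \{n_i\}} \Big(\prod_i \frac{1}{n_i!} (V_i)^{n_i}\Big),$$

where $(\sum \text{connected})$ is an abbreviation for the sum of the values of all
connected pieces. It is not too hard to see that the rest of the expression can also
be factored (try working backwards):

$$\begin{aligned}
&= \Big(\sum \text{connected}\Big) \times \Big(\sum_{n_1} \frac{1}{n_1!} V_1^{n_1}\Big) \Big(\sum_{n_2} \frac{1}{n_2!} V_2^{n_2}\Big) \Big(\sum_{n_3} \frac{1}{n_3!} V_3^{n_3}\Big) \cdots \\
&= \Big(\sum \text{connected}\Big) \times \prod_i \Big(\sum_{n_i} \frac{1}{n_i!} V_i^{n_i}\Big) \\
&= \Big(\sum \text{connected}\Big) \times \prod_i \exp(V_i) \\
&= \Big(\sum \text{connected}\Big) \times \exp\Big(\sum_i V_i\Big).
\end{aligned} \tag{4.52}$$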
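The factorization in (4.52) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assigns hypothetical real values to three disconnected pieces (the actual $V_i$ are in general complex), truncates each $n_i$ at an assumed cutoff `n_max`, and compares the brute-force sum over all sets $\{n_i\}$ with $\exp(\sum_i V_i)$. The function name `sum_over_occupations` is illustrative only.

```python
import itertools
import math


def sum_over_occupations(values, n_max):
    """Sum prod_i V_i**n_i / n_i! over all sets {n_i} with 0 <= n_i <= n_max."""
    total = 0.0
    for ns in itertools.product(range(n_max + 1), repeat=len(values)):
        term = 1.0
        for v, n in zip(values, ns):
            term *= v**n / math.factorial(n)
        total += term
    return total


# Hypothetical real values for three disconnected pieces V_1, V_2, V_3.
V = [0.3, -0.2, 0.5]
truncated_sum = sum_over_occupations(V, n_max=12)  # left side, truncated
exponential = math.exp(sum(V))                     # right side of (4.52)
print(truncated_sum, exponential)
```

With these small values the truncation error is far below floating-point noise, so the two printed numbers should agree closely.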

We have just shown that the sum of all diagrams is equal to the sum of 
all connected diagrams, times the exponential of the sum of all disconnected 
diagrams. (We should really say “pieces” rather than “diagrams” on the right- 
hand side of the equality, but from now on we will often just call a single piece 
a “diagram.”) Pictorially, the identity is 


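$$\lim_{T\to\infty(1-i\epsilon)} \langle 0|\, T\Big\{\phi_I(x)\phi_I(y) \exp\Big[-i\int_{-T}^{T} dt\, H_I(t)\Big]\Big\} |0\rangle = \Big([\text{connected diagrams}]\Big) \times \exp\Big([\text{disconnected diagrams}]\Big). \tag{4.53}$$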


Now consider the denominator of our formula (4.31) for the two-point
function. By an argument identical to the above, it is just

$$\langle 0|\, T\Big\{\exp\Big[-i\int_{-T}^{T} dt\, H_I(t)\Big]\Big\} |0\rangle = \exp\Big[\sum_i V_i\Big],$$

which cancels the exponential of the disconnected diagrams in the numerator.
This is the final simplification of the formula, which now reads

$$\langle\Omega|\, T[\phi(x)\phi(y)]\, |\Omega\rangle = \text{sum of all connected diagrams with two external points}. \tag{4.54}$$


We have come a long way from our original formula, Eq. (4.31). 

Having gotten rid of the disconnected diagrams in our formula for the 
correlation function, we might pause a moment to go back and interpret them 
physically. The place to look is Eq. (4.30), which can be written

$$\lim_{T\to\infty(1-i\epsilon)} \langle 0|\, T\Big\{\phi_I(x)\phi_I(y) \exp\Big[-i\int_{-T}^{T} dt\, H_I(t)\Big]\Big\} |0\rangle = \langle\Omega|\, T\phi(x)\phi(y)\, |\Omega\rangle \cdot \lim_{T\to\infty(1-i\epsilon)} \Big(|\langle 0|\Omega\rangle|^2 e^{-iE_0(2T)}\Big).$$

Looking only at the $T$-dependent parts of both sides, this implies

$$\exp\Big[\sum_i V_i\Big] \propto \exp[-iE_0(2T)]. \tag{4.55}$$

Since each disconnected diagram $V_i$ contains a factor of $(2\pi)^4 \delta^{(4)}(0) = 2T \cdot V$,
this gives us a formula for the energy density of the vacuum (relative to the
zero of energy set by $H_0|0\rangle = 0$):

$$\frac{E_0}{\text{volume}} = i \Big[\sum_i V_i\Big] \Big/ \Big[(2\pi)^4 \delta^{(4)}(0)\Big]. \tag{4.56}$$

We should emphasize that the right-hand side is independent of $T$ and (volume);
in particular it is reassuring to see that $E_0$ is proportional to the volume
of space. In Chapter 11 we will find that this formula is actually useful.

This completes our present analysis of the two-point correlation function. 
The generalization to higher correlation functions is easy: 


$$\langle\Omega|\, T[\phi(x_1) \cdots \phi(x_n)]\, |\Omega\rangle = \Big(\text{sum of all connected diagrams with } n \text{ external points}\Big). \tag{4.57}$$


The disconnected diagrams exponentiate, factor, and cancel as before, by the 
same argument. There is now a potential confusion in terminology, however. 
By “disconnected” we mean “disconnected from all external points”—exactly 
the same diagrams as in (4.51). (They are sometimes called “vacuum bubbles”
or “vacuum to vacuum transitions”.) In higher correlation functions, diagrams
can also be disconnected in another sense. Consider, for example, the four-point
function:

$$\langle\Omega|\, T\phi_1\phi_2\phi_3\phi_4\, |\Omega\rangle = [\text{diagrams}] \tag{4.58}$$


In many of these diagrams, external points are disconnected from each other. 
Such diagrams do not exponentiate or factor; they contribute to the amplitude 
just as do the fully connected diagrams (in which any point can be reached 
from any other by traveling along the lines). 

Note that in $\phi^4$ theory, all correlation functions of an odd number of fields
vanish, since it is impossible to draw an allowed diagram with an odd number
of external points. We can also see this by going back to Wick’s theorem: The
interaction Hamiltonian $H_I$ contains an even number of fields, so all terms
in the perturbation expansion of an odd correlation function will contain an 
odd number of fields. But it is impossible to fully contract an odd number 
of fields in pairs, and only fully contracted terms have nonvanishing vacuum 
expectation value. 
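The counting behind this argument can be made concrete. The sketch below (an illustration, with the hypothetical helper name `count_full_contractions`) counts the ways of pairing up a given number of field operators completely: the first field is paired with any of the others, and the remainder is paired recursively. An odd number of fields always leaves one field over, so the count is zero.

```python
def count_full_contractions(n_fields):
    """Number of ways to contract n_fields operators completely in pairs."""
    if n_fields == 0:
        return 1  # nothing left to pair: one (empty) way
    if n_fields == 1:
        return 0  # a single leftover field cannot be contracted
    # Pair the first field with any of the other n-1 fields, then pair the rest.
    return (n_fields - 1) * count_full_contractions(n_fields - 2)


# Two external fields plus one phi^4 vertex: 6 fields.
print(count_full_contractions(6))
# Three external fields plus one phi^4 vertex: 7 fields.
print(count_full_contractions(7))
```

Adding further $\phi^4$ vertices adds four fields at a time, so the parity of the total never changes and the odd correlation functions vanish at every order.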

4.5 Cross Sections and the S-Matrix 

We now have an extremely beautiful formula, Eq. (4.57), for computing an 
extremely abstract quantity: the n-point correlation function. Our next task 
is to find equally beautiful ways of computing quantities that can actually be 
measured: cross sections and decay rates. In this section, after briefly reviewing 
the definitions of these objects, we will relate them (via a rather technical but 
fairly careful derivation) to a more primitive quantity, the S-matrix. In the 
next section we will learn how to compute the matrix elements of the S-matrix
using Feynman diagrams. 

The Cross Section 

The experiments that probe the behavior of elementary particles, especially 
in the relativistic regime, are scattering experiments. One collides two beams 
of particles with well-defined momenta, and observes what comes out. The 
likelihood of any particular final state can be expressed in terms of the cross
section, a quantity that is intrinsic to the colliding particles, and therefore
allows comparison of two different experiments with different beam sizes and
intensities.

The cross section is defined as follows. Consider a target, at rest, of particles
of type $A$, with density $\rho_A$ (particles per unit volume). Aim at this target
a bunch of particles of type $B$, with number density $\rho_B$ and velocity $v$:

[Figure: a bunch of $B$ particles with velocity $v$ incident on a target of $A$ particles]

Let $\ell_A$ and $\ell_B$ be the lengths of the bunches of particles. Then we expect
the total number of scattering events (or scattering events of any particular
desired type) to be proportional to $\rho_A$, $\rho_B$, $\ell_A$, $\ell_B$, and the cross-sectional
area $A$ common to the two bunches. The cross section, denoted by $\sigma$, is just
the total number of events (of whatever type desired) divided by all of these
quantities:

$$\sigma \equiv \frac{\text{Number of scattering events}}{\rho_A\, \ell_A\, \rho_B\, \ell_B\, A}. \tag{4.59}$$


The definition is symmetric between the $A$’s and $B$’s, so of course we could
have taken the $B$’s to be at rest, or worked in any other reference frame.

The cross section has units of area. In fact, it is the effective area of 
a chunk taken out of one beam, by each particle in the other beam, that 
subsequently becomes the final state we are interested in. 

In real beams, $\rho_A$ and $\rho_B$ are not constant; the particle density is generally
larger at the center of the beam than at the edges. We will assume, however,
that both the range of the interaction between the particles and the width of
the individual particle wavepackets are small compared to the beam diameter.
We can then consider $\rho_A$ and $\rho_B$ to be constant in what follows, and remember
that, to compute the event rate in an actual accelerator, one must integrate
over the beam area:

$$\text{Number of events} = \sigma\, \ell_A \ell_B \int d^2x\, \rho_A(x)\, \rho_B(x). \tag{4.60}$$

If the densities are constant, or if we use this formula to compute an effective
area $A$ of the beams, then we have simply

$$\text{Number of events} = \frac{\sigma N_A N_B}{A}, \tag{4.61}$$

where $N_A$ and $N_B$ are the total numbers of $A$ and $B$ particles.
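As a quick consistency check of (4.59) against (4.61), the sketch below assumes uniform bunches, so that $N_A = \rho_A \ell_A A$ and $N_B = \rho_B \ell_B A$. The function names and all numerical values are hypothetical; any consistent system of units will do.

```python
import math


def cross_section(events, rho_a, len_a, rho_b, len_b, area):
    """Eq. (4.59): events divided by the densities, bunch lengths, and area."""
    return events / (rho_a * len_a * rho_b * len_b * area)


def number_of_events(sigma, n_a, n_b, area):
    """Eq. (4.61): events = sigma * N_A * N_B / A for uniform bunches."""
    return sigma * n_a * n_b / area


# Hypothetical uniform bunches (arbitrary but consistent units).
rho_a, len_a = 2.0e10, 5.0
rho_b, len_b = 1.0e9, 3.0
area = 0.01
sigma = 1.0e-12

n_a = rho_a * len_a * area  # total number of A particles
n_b = rho_b * len_b * area  # total number of B particles
events = number_of_events(sigma, n_a, n_b, area)
print(events, cross_section(events, rho_a, len_a, rho_b, len_b, area))
```

Feeding the event count from (4.61) back into the definition (4.59) should return the cross section that was put in.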

Cross sections for many different processes may be relevant to a single
scattering experiment. In $e^+e^-$ collisions, for example, one can measure the
cross sections for production of $\mu^+\mu^-$, $\tau^+\tau^-$, $\mu^+\mu^-\gamma$, $\mu^+\mu^-\gamma\gamma$, etc., and
countless processes involving hadron production, not to mention simple $e^+e^-$
scattering. Usually, of course, we wish to measure not only what the final-state
particles are, but also the momenta with which they come out. In this case
our definition (4.59) of $\sigma$ still works, but if we specify the exact momenta
desired, $\sigma$ will be infinitesimal. The solution is to define the differential cross
section, $d\sigma/(d^3p_1 \cdots d^3p_n)$. It is simply the quantity that, when integrated
over any small $d^3p_1 \cdots d^3p_n$, gives the cross section for scattering into that
region of final-state momentum space. The various final-state momenta are not
all independent: Four components will always be constrained by 4-momentum
conservation. In the simplest case, where there are only two final-state
particles, this leaves only two unconstrained momentum components, usually taken
to be the angles $\theta$ and $\phi$ of the momentum of one of the particles. Integrating
$d\sigma/(d^3p_1\, d^3p_2)$ over the four constrained momentum components then leaves
us with the usual differential cross section $d\sigma/d\Omega$.

A somewhat simpler measurable quantity is the decay rate $\Gamma$ of an unstable
particle $A$ (assumed to be at rest) into a specified final state (of two or
more particles). It is defined as

$$\Gamma \equiv \frac{\text{Number of decays per unit time}}{\text{Number of } A \text{ particles present}}. \tag{4.62}$$

The lifetime $\tau$ of the particle is then the reciprocal of the sum of its decay
rates into all possible final states. (The particle’s half-life is $\tau \cdot \ln 2$.)
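The relation between partial decay rates, lifetime, and half-life is simple enough to state in a few lines. In this sketch the function names and the two partial rates are hypothetical, chosen only for illustration.

```python
import math


def lifetime(partial_rates):
    """Lifetime tau: reciprocal of the sum of decay rates into all final states."""
    return 1.0 / sum(partial_rates)


def half_life(partial_rates):
    """Half-life: tau * ln 2."""
    return lifetime(partial_rates) * math.log(2.0)


# Hypothetical partial decay rates into two final states (inverse time units).
rates = [2.0, 3.0]
print(lifetime(rates), half_life(rates))
```

After one half-life the surviving fraction $e^{-t/\tau}$ is one half, which is the reason for the factor $\ln 2$.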

In nonrelativistic quantum mechanics, an unstable atomic state shows up
in scattering experiments as a resonance. Near the resonance energy $E_0$, the
scattering amplitude is given by the Breit-Wigner formula

$$f(E) \propto \frac{1}{E - E_0 + i\Gamma/2}. \tag{4.63}$$

The cross section therefore has a peak of the form

$$\sigma \propto \frac{1}{(E - E_0)^2 + \Gamma^2/4}.$$


The width of the resonance peak is equal to the decay rate of the unstable 
state. 
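The statement that the width of the peak equals $\Gamma$ can be verified directly: the peak shape falls to half its maximum at $E = E_0 \pm \Gamma/2$, so its full width at half maximum is $\Gamma$. The sketch below uses hypothetical values of $E_0$ and $\Gamma$, and the function names are illustrative.

```python
import math


def breit_wigner_amplitude(energy, e0, gamma):
    """Eq. (4.63) up to normalization: 1 / (E - E0 + i*Gamma/2)."""
    return 1.0 / complex(energy - e0, gamma / 2.0)


def peak_shape(energy, e0, gamma):
    """Cross-section peak: 1 / ((E - E0)^2 + Gamma^2/4)."""
    return 1.0 / ((energy - e0) ** 2 + gamma**2 / 4.0)


e0, gamma = 10.0, 0.4  # hypothetical resonance energy and decay rate
maximum = peak_shape(e0, e0, gamma)
print(peak_shape(e0 + gamma / 2.0, e0, gamma) / maximum)
print(peak_shape(e0 - gamma / 2.0, e0, gamma) / maximum)
```

Both printed ratios should be one half, and `peak_shape` is just the squared modulus of `breit_wigner_amplitude`.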

The Breit-Wigner formula (4.63) also applies in relativistic quantum
mechanics. In particular, it gives the scattering amplitude for processes in which
initial particles combine to form an unstable particle, which then decays. The
unstable particle, viewed as an excited state of the vacuum, is a direct
analogue of the unstable nonrelativistic atomic state. If we call the 4-momentum
of the unstable particle $p$ and its mass $m$, we can make a relativistically
invariant generalization of (4.63):

$$\frac{1}{p^2 - m^2 + im\Gamma} \approx \frac{1}{2E_{\mathbf{p}}\big(p^0 - E_{\mathbf{p}} + i(m/E_{\mathbf{p}})\Gamma/2\big)}. \tag{4.64}$$

The decay rate of the unstable particle in a general frame is $(m/E_{\mathbf{p}})\Gamma$, in
accord with relativistic time dilation. Although the two expressions in (4.64) are
equal in the vicinity of the resonance, the left-hand side, which is manifestly
Lorentz invariant, is much more convenient.
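To see how good the approximation in (4.64) is, note that $p^2 - m^2 = (p^0 - E_{\mathbf{p}})(p^0 + E_{\mathbf{p}})$, which reduces to $2E_{\mathbf{p}}(p^0 - E_{\mathbf{p}})$ when $p^0$ is close to $E_{\mathbf{p}}$. The following sketch compares the two sides numerically for hypothetical values of $m$, $\Gamma$, and $|\mathbf{p}|$; the function names are illustrative.

```python
import math


def invariant_form(p0, p_abs, m, gamma):
    """Left side of (4.64): 1 / (p^2 - m^2 + i*m*Gamma), with p^2 = p0^2 - |p|^2."""
    return 1.0 / complex(p0**2 - p_abs**2 - m**2, m * gamma)


def near_resonance_form(p0, p_abs, m, gamma):
    """Right side of (4.64): 1 / (2 E_p (p0 - E_p + i (m/E_p) Gamma/2))."""
    e_p = math.sqrt(p_abs**2 + m**2)
    return 1.0 / (2.0 * e_p * complex(p0 - e_p, (m / e_p) * gamma / 2.0))


m, gamma, p_abs = 1.0, 0.01, 2.0  # hypothetical mass, width, and momentum
e_p = math.sqrt(p_abs**2 + m**2)
for offset in (0.0, 0.005, 0.02):
    lhs = invariant_form(e_p + offset, p_abs, m, gamma)
    rhs = near_resonance_form(e_p + offset, p_abs, m, gamma)
    print(offset, abs(lhs - rhs) / abs(lhs))
```

The relative difference vanishes exactly on resonance and grows only slowly as $p^0$ moves away from $E_{\mathbf{p}}$.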
The S-Matrix

How, then, do we calculate a cross section? We must set up wavepackets
representing the initial-state particles, evolve this initial state for a very long
time with the time-evolution operator $\exp(-iHt)$ of the interacting field
theory, and overlap the resulting final state with wavepackets representing some
desired set of final-state particles. This gives the probability amplitude for
producing that final state, which is simply related to the cross section. We
will find that, in the limit where the wavepackets are very narrow in momentum
space, the amplitude depends only on the momenta of the wavepackets,
not on the details of their shapes.†

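A wavepacket representing some desired state $|\phi\rangle$ can be expressed as

$$|\phi\rangle = \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{k}}}}\, \phi(\mathbf{k})\, |\mathbf{k}\rangle, \tag{4.65}$$

where $\phi(\mathbf{k})$ is the Fourier transform of the spatial wavefunction, and $|\mathbf{k}\rangle$ is a
one-particle state of momentum $\mathbf{k}$ in the interacting theory. In the free theory,
we would have $|\mathbf{k}\rangle = \sqrt{2E_{\mathbf{k}}}\, a_{\mathbf{k}}^\dagger |0\rangle$. The factor of $\sqrt{2E_{\mathbf{k}}}$ converts our relativistic
normalization of $|\mathbf{k}\rangle$ to the conventional normalization in which the sum of
all probabilities adds up to 1:

$$\langle\phi|\phi\rangle = 1 \quad \text{if} \quad \int \frac{d^3k}{(2\pi)^3}\, |\phi(\mathbf{k})|^2 = 1. \tag{4.66}$$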

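The probability we wish to compute is then

$$\mathcal{P} = \big| \langle \underbrace{\phi_1 \phi_2 \cdots}_{\text{future}} | \underbrace{\phi_A \phi_B}_{\text{past}} \rangle \big|^2, \tag{4.67}$$

where $|\phi_A\phi_B\rangle$ is a state of two wavepackets constructed in the far past and
$\langle\phi_1\phi_2\cdots|$ is a state of several wavepackets (one for each final-state particle)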
constructed in the far future. The wavepackets are localized in space, so each 
can be constructed independently of the others. States constructed in this 
way are called in and out states. Note that we use the Heisenberg picture: 
States are time-independent, but the name we give a state depends on the 
eigenvalues or expectation values of time-dependent operators. Thus states 
with different names constructed at different times have a nontrivial overlap, 
which depends on the time dependence of the operators. 

If we set up $|\phi_A\phi_B\rangle$ in the remote past, and then take the limit in which
the wavepackets $\phi_i(\mathbf{k}_i)$ become concentrated about definite momenta $\mathbf{p}_i$, this
defines an in state $|\mathbf{p}_A\mathbf{p}_B\rangle_{\text{in}}$ with definite initial momenta. It is useful to view
$|\phi_A\phi_B\rangle$ as a linear superposition of such states. It is important, however, to
take into account the transverse displacement of the wavepacket $\phi_B$ relative
to $\phi_A$ in position space (see Fig. 4.2). Although we could leave this implicit
in the form of $\phi_B(\mathbf{k}_B)$, we instead adopt the convention that our reference
momentum-space wavefunctions are collinear (that is, have impact parameter
$\mathbf{b} = 0$), and write $\phi_B(\mathbf{k}_B)$ with an explicit factor $\exp(-i\mathbf{b}\cdot\mathbf{k}_B)$ to account for
the spatial translation. Then, since $\phi_A$ and $\phi_B$ are constructed independently
at different locations, we can write the initial state as

$$|\phi_A\phi_B\rangle_{\text{in}} = \int \frac{d^3k_A}{(2\pi)^3} \int \frac{d^3k_B}{(2\pi)^3}\, \frac{\phi_A(\mathbf{k}_A)\, \phi_B(\mathbf{k}_B)\, e^{-i\mathbf{b}\cdot\mathbf{k}_B}}{\sqrt{(2E_A)(2E_B)}}\, |\mathbf{k}_A\mathbf{k}_B\rangle_{\text{in}}. \tag{4.68}$$

†Much of this section is based on the treatment of nonrelativistic scattering given
in Taylor (1972), Chapters 2, 3, and 17. We concentrate on the additional complications
of the relativistic theory, glossing over many subtleties, common to both cases, which
Taylor explains carefully.

Figure 4.2. Incident wavepackets are uniformly distributed in impact
parameter $\mathbf{b}$.


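We could expand $\langle\phi_1\phi_2\cdots|$ in terms of similarly defined out states of definite
momentum formed in the asymptotic future:‡

$${}_{\text{out}}\langle\phi_1\phi_2\cdots| = \Big(\prod_f \int \frac{d^3p_f}{(2\pi)^3} \frac{\phi_f(\mathbf{p}_f)}{\sqrt{2E_f}}\Big)\; {}_{\text{out}}\langle\mathbf{p}_1\mathbf{p}_2\cdots|.$$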


It is much easier, however, to use the out states of definite momentum as 
the final states in the probability amplitude (4.67), and to multiply by the 
various normalization factors after squaring the amplitude. This is physically 
reasonable as long as the detectors of final-state particles mainly measure 
momentum—that is, they do not resolve positions at the level of de Broglie 
wavelengths. 

We can now relate the probability of scattering in a real experiment to 
an idealized set of transition amplitudes between the asymptotically defined 
in and out states of definite momentum, 


$${}_{\text{out}}\langle\mathbf{p}_1\mathbf{p}_2\cdots|\mathbf{k}_A\mathbf{k}_B\rangle_{\text{in}}. \tag{4.69}$$


‡Here and below, the product symbol applies (symbolically) to the integral as
well as the other factors in parentheses; the integrals apply to what is outside the
parentheses as well.

To compute the overlap of in states with out states, we note that the
conventions for defining the two sets of states are related by time translation:

$$\begin{aligned}
{}_{\text{out}}\langle\mathbf{p}_1\mathbf{p}_2\cdots|\mathbf{k}_A\mathbf{k}_B\rangle_{\text{in}} &= \lim_{T\to\infty} \langle \underbrace{\mathbf{p}_1\mathbf{p}_2\cdots}_{T} | \underbrace{\mathbf{k}_A\mathbf{k}_B}_{-T} \rangle \\
&= \lim_{T\to\infty} \langle\mathbf{p}_1\mathbf{p}_2\cdots|\, e^{-iH(2T)}\, |\mathbf{k}_A\mathbf{k}_B\rangle.
\end{aligned} \tag{4.70}$$


In the last line, the states are defined at any common reference time. Thus, the
in and out states are related by the limit of a sequence of unitary operators.
This limiting unitary operator is called the S-matrix:

$${}_{\text{out}}\langle\mathbf{p}_1\mathbf{p}_2\cdots|\mathbf{k}_A\mathbf{k}_B\rangle_{\text{in}} \equiv \langle\mathbf{p}_1\mathbf{p}_2\cdots|\, S\, |\mathbf{k}_A\mathbf{k}_B\rangle. \tag{4.71}$$

The $S$-matrix has the following structure: If the particles in question
do not interact at all, $S$ is simply the identity operator. Even if the theory
contains interactions, the particles have some probability of simply missing
one another. To isolate the interesting part of the $S$-matrix (that is, the part
due to interactions), we define the $T$-matrix by

$$S = 1 + iT. \tag{4.72}$$

Next we note that the matrix elements of $S$ should reflect 4-momentum
conservation. Thus $S$ or $T$ should always contain a factor $\delta^{(4)}(k_A + k_B - \sum p_f)$.
Extracting this factor, we define the invariant matrix element $\mathcal{M}$ by

$$\langle\mathbf{p}_1\mathbf{p}_2\cdots|\, iT\, |\mathbf{k}_A\mathbf{k}_B\rangle = (2\pi)^4 \delta^{(4)}\Big(k_A + k_B - \sum p_f\Big) \cdot i\mathcal{M}(k_A, k_B \to p_f). \tag{4.73}$$

We have written this expression in terms of 4-momenta $p$ and $k$, but of course
all 4-momenta are on mass-shell: $p^0 = E_{\mathbf{p}}$, $k^0 = E_{\mathbf{k}}$. (Note that our entire
treatment is specific to the case where the initial state contains only two
particles. For 3→many or many→many interactions, one can invent analogous
constructions, but we will not consider such complicated experiments in this
book.)

The matrix element $\mathcal{M}$ is analogous to the scattering amplitude $f$ of
one-particle quantum mechanics. It is useful because it allows us to separate
all the physics that depends on the details of the interaction Hamiltonian
(“dynamics”) from all the physics that doesn’t (“kinematics”). In the next
section we will discuss how to compute $\mathcal{M}$ using Feynman diagrams. But
first, we must figure out how to reconstruct the cross section $\sigma$ from $\mathcal{M}$.

To do this, let us calculate, in terms of $\mathcal{M}$, the probability for the initial
state $|\phi_A\phi_B\rangle$ to scatter and become a final state of $n$ particles whose momenta
lie in a small region $d^3p_1 \cdots d^3p_n$. In our normalization, this probability is

$$\mathcal{P}(AB \to 1\,2 \ldots n) = \Big(\prod_f \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) \big|{}_{\text{out}}\langle\mathbf{p}_1\cdots\mathbf{p}_n|\phi_A\phi_B\rangle_{\text{in}}\big|^2. \tag{4.74}$$

For a single target ($A$) particle and many incident ($B$) particles with different
impact parameters $\mathbf{b}$, the number of scattering events is

$$N = \sum_{\substack{\text{all incident}\\ \text{particles } i}} \mathcal{P}_i = \int d^2b\; n_B\, \mathcal{P}(\mathbf{b}),$$

where $n_B$ is the number density (particles per unit area) of $B$ particles. Since
we are assuming that this number density is constant over the range of the
interaction, $n_B$ can be taken outside the integral. The cross section is then

$$\sigma = \frac{N}{n_B N_A} = \frac{N}{n_B \cdot 1} = \int d^2b\; \mathcal{P}(\mathbf{b}). \tag{4.75}$$
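Equation (4.75) says the cross section is the scattering probability integrated over impact parameter. As a toy illustration (not a field-theory calculation), the sketch below assumes an azimuthally symmetric, hypothetical probability profile $\mathcal{P}(b) = e^{-b^2/R^2}$, for which the integral is exactly $\pi R^2$, and evaluates $\int d^2b\, \mathcal{P} = 2\pi \int b\, \mathcal{P}(b)\, db$ with a midpoint rule.

```python
import math


def cross_section_from_probability(prob, b_max, steps=20000):
    """Midpoint-rule estimate of sigma = 2*pi * integral of b * P(b) db on [0, b_max]."""
    db = b_max / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * db
        total += 2.0 * math.pi * b * prob(b) * db
    return total


R = 2.0  # hypothetical range of the interaction


def gaussian_profile(b):
    """Hypothetical scattering probability at impact parameter b."""
    return math.exp(-(b / R) ** 2)


sigma = cross_section_from_probability(gaussian_profile, b_max=10.0 * R)
print(sigma, math.pi * R**2)
```

The result has units of area, set entirely by the range $R$ over which the probability is appreciable.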

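Deriving a simple expression for $\sigma$ in terms of $\mathcal{M}$ is now a fairly
straightforward calculation. Combining (4.75), (4.74), and (4.68), we have (writing
$d\sigma$ rather than $\sigma$ since this is an infinitesimal quantity)

$$\begin{aligned}
d\sigma = {} & \Big(\prod_f \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) \int d^2b\, \Big(\prod_{i=A,B} \int \frac{d^3k_i}{(2\pi)^3} \frac{\phi_i(\mathbf{k}_i)}{\sqrt{2E_i}} \int \frac{d^3\bar{k}_i}{(2\pi)^3} \frac{\phi_i^*(\bar{\mathbf{k}}_i)}{\sqrt{2\bar{E}_i}}\Big) \\
& \times e^{i\mathbf{b}\cdot(\bar{\mathbf{k}}_B - \mathbf{k}_B)} \big({}_{\text{out}}\langle\{\mathbf{p}_f\}|\{\mathbf{k}_i\}\rangle_{\text{in}}\big) \big({}_{\text{out}}\langle\{\mathbf{p}_f\}|\{\bar{\mathbf{k}}_i\}\rangle_{\text{in}}\big)^*,
\end{aligned} \tag{4.76}$$

where we have used $\bar{\mathbf{k}}_A$ and $\bar{\mathbf{k}}_B$ as dummy integration variables in the second
half of the squared amplitude. The $d^2b$ integral can be performed to give a
factor of $(2\pi)^2 \delta^{(2)}(k_B^\perp - \bar{k}_B^\perp)$. We get more delta functions by writing the final
two factors of (4.76) in terms of $\mathcal{M}$. Assuming that we are not interested in
the trivial case of forward scattering where no interaction takes place, we can
drop the 1 in Eq. (4.72) and write these factors as

$$\begin{aligned}
\big({}_{\text{out}}\langle\{\mathbf{p}_f\}|\{\mathbf{k}_i\}\rangle_{\text{in}}\big) &= i\mathcal{M}(\{k_i\} \to \{p_f\})\, (2\pi)^4 \delta^{(4)}\Big(\sum k_i - \sum p_f\Big); \\
\big({}_{\text{out}}\langle\{\mathbf{p}_f\}|\{\bar{\mathbf{k}}_i\}\rangle_{\text{in}}\big)^* &= -i\mathcal{M}^*(\{\bar{k}_i\} \to \{p_f\})\, (2\pi)^4 \delta^{(4)}\Big(\sum \bar{k}_i - \sum p_f\Big).
\end{aligned}$$

We can use the second of these delta functions, together with the $\delta^{(2)}(k_B^\perp - \bar{k}_B^\perp)$,
to perform all six of the $\bar{k}$ integrals in (4.76). Of the six integrals, only those
over $\bar{k}_A^z$ and $\bar{k}_B^z$ require some work:

$$\begin{aligned}
\int d\bar{k}_A^z\, d\bar{k}_B^z\; & \delta\Big(\bar{k}_A^z + \bar{k}_B^z - \sum p_f^z\Big)\, \delta\Big(\bar{E}_A + \bar{E}_B - \sum E_f\Big) \\
&= \int d\bar{k}_A^z\; \delta\Big(\sqrt{\bar{k}_A^2 + m_A^2} + \sqrt{\bar{k}_B^2 + m_B^2} - \sum E_f\Big)\Big|_{\bar{k}_B^z = \sum p_f^z - \bar{k}_A^z} \\
&= \frac{1}{\Big|\dfrac{\bar{k}_A^z}{\bar{E}_A} - \dfrac{\bar{k}_B^z}{\bar{E}_B}\Big|} = \frac{1}{|v_A - v_B|}.
\end{aligned} \tag{4.77}$$

In the last line and in the rest of Eq. (4.76) it is understood that the
constraints $\bar{k}_A^z + \bar{k}_B^z = \sum p_f^z$ and $\bar{E}_A + \bar{E}_B = \sum E_f$ now apply (in addition to
the constraints $\bar{k}_A^\perp = k_A^\perp$ and $\bar{k}_B^\perp = k_B^\perp$ coming from the other four integrals).
The difference $|v_A - v_B|$ is the relative velocity of the beams as viewed from
the laboratory frame.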

Now recall that the initial wavepackets are localized in momentum space,
centered on $\mathbf{p}_A$ and $\mathbf{p}_B$. This means that we can evaluate all factors that
are smooth functions of $\mathbf{k}_A$ and $\mathbf{k}_B$ at $\mathbf{p}_A$ and $\mathbf{p}_B$, pulling them outside the
integrals. These factors include $E_A$, $E_B$, $|v_A - v_B|$, and $\mathcal{M}$ (everything except
the remaining delta function). After doing this, we arrive at the expression

$$\begin{aligned}
d\sigma = {} & \Big(\prod_f \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) \frac{|\mathcal{M}(p_A, p_B \to \{p_f\})|^2}{2E_A\, 2E_B\, |v_A - v_B|} \int \frac{d^3k_A}{(2\pi)^3} \int \frac{d^3k_B}{(2\pi)^3} \\
& \times |\phi_A(\mathbf{k}_A)|^2\, |\phi_B(\mathbf{k}_B)|^2\, (2\pi)^4 \delta^{(4)}\Big(k_A + k_B - \sum p_f\Big).
\end{aligned} \tag{4.78}$$

To simplify this formula further, we should think a bit more about the
properties of real particle detectors. We have already noted that real detectors
project mainly onto eigenstates of momentum. But real detectors have
finite resolution; that is, they sum incoherently over momentum bites of finite
size. Normally, the measurement of the final-state momentum is not of
such high quality that it can resolve the small variation of this momentum
that results from the momentum spread of the initial wavepackets $\phi_A$, $\phi_B$. In
that case, we may treat even the momentum vector $k_A + k_B$ inside the delta
function as being well approximated by its central value $p_A + p_B$. With this
further approximation, we can perform the integrals over $k_A$ and $k_B$ using the
normalization condition (4.66). This produces the final form of the relation
between $S$-matrix elements and cross sections,


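$$\begin{aligned}
d\sigma = {} & \frac{1}{2E_A\, 2E_B\, |v_A - v_B|} \Big(\prod_f \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) \\
& \times |\mathcal{M}(p_A, p_B \to \{p_f\})|^2\, (2\pi)^4 \delta^{(4)}\Big(p_A + p_B - \sum p_f\Big).
\end{aligned} \tag{4.79}$$

All dependence on the shapes of the wavepackets has disappeared.
The integral over final-state momenta in (4.79) has the structure

$$\int d\Pi_n = \Big(\prod_f \int \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) (2\pi)^4 \delta^{(4)}\Big(P - \sum p_f\Big), \tag{4.80}$$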


with $P$ the total initial 4-momentum. This integral is manifestly Lorentz
invariant, since it is built up from invariant 3-momentum integrals constrained
by a 4-momentum delta function. This integral is known as relativistically
invariant $n$-body phase space. Of the other ingredients in (4.79), the matrix
element $\mathcal{M}$ is also Lorentz invariant. The Lorentz transformation property of
(4.79) therefore comes entirely from the prefactor

$$\frac{1}{E_A E_B |v_A - v_B|} = \frac{1}{|E_B\, p_A^z - E_A\, p_B^z|} = \frac{1}{|\epsilon_{\mu x y \nu}\, p_A^\mu p_B^\nu|}.$$

This is not Lorentz invariant, but it is invariant to boosts along the $z$-direction.
In fact, this expression has exactly the transformation properties of a
cross-sectional area.
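The invariance of this prefactor under boosts along the beam axis is easy to confirm numerically. The sketch below takes two collinear particles with hypothetical masses and momenta, boosts both along $z$ with an arbitrary velocity `beta`, and compares $|E_B\, p_A^z - E_A\, p_B^z|$ before and after. The helper names are illustrative.

```python
import math


def energy(mass, pz):
    """On-shell energy for a particle moving along z."""
    return math.sqrt(mass**2 + pz**2)


def boost_z(e, pz, beta):
    """Lorentz boost along z with velocity beta (in units of c)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (e - beta * pz), g * (pz - beta * e)


def flux_factor(e_a, pz_a, e_b, pz_b):
    """|E_B p_A^z - E_A p_B^z|, equal to E_A E_B |v_A - v_B| for collinear beams."""
    return abs(e_b * pz_a - e_a * pz_b)


# Hypothetical collinear beams.
pz_a, pz_b = 3.0, -2.0
e_a, e_b = energy(0.5, pz_a), energy(1.0, pz_b)
before = flux_factor(e_a, pz_a, e_b, pz_b)

e_a2, pz_a2 = boost_z(e_a, pz_a, beta=0.6)
e_b2, pz_b2 = boost_z(e_b, pz_b, beta=0.6)
after = flux_factor(e_a2, pz_a2, e_b2, pz_b2)
print(before, after)
```

Algebraically the boosted combination picks up a factor $\gamma^2(1 - \beta^2) = 1$, which is why the two numbers agree.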
For the special case of two particles in the final state, we can simplify
the general expression (4.79) by partially evaluating the phase-space integrals
in the center-of-mass frame. Label the momenta of the two final particles
$p_1$ and $p_2$. We first choose to integrate all three components of $\mathbf{p}_2$ over the
delta functions enforcing 3-momentum conservation. This sets $\mathbf{p}_2 = -\mathbf{p}_1$ and
converts the integral over two-body phase space to the form

$$\int d\Pi_2 = \int \frac{dp_1\, p_1^2\, d\Omega}{(2\pi)^3\, 2E_1\, 2E_2}\, (2\pi)\, \delta(E_{\text{cm}} - E_1 - E_2), \tag{4.81}$$

where $E_1 = \sqrt{p_1^2 + m_1^2}$, $E_2 = \sqrt{p_1^2 + m_2^2}$, and $E_{\text{cm}}$ is the total initial energy.
Integrating over the final delta function gives

$$\begin{aligned}
\int d\Pi_2 &= \int d\Omega\, \frac{p_1^2}{16\pi^2 E_1 E_2} \Big(\frac{p_1}{E_1} + \frac{p_1}{E_2}\Big)^{-1} \\
&= \int d\Omega\, \frac{1}{16\pi^2} \frac{|\mathbf{p}_1|}{E_{\text{cm}}}.
\end{aligned} \tag{4.82}$$

For reactions symmetric about the collision axis, two-body phase space can
be written simply as an integral over the polar angle in the center-of-mass
frame:

$$\int d\Pi_2 = \int d\cos\theta\, \frac{1}{16\pi} \frac{2|\mathbf{p}_1|}{E_{\text{cm}}}. \tag{4.83}$$


The last factor tends to 1 at high energy. 
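To evaluate (4.82) and (4.83) for given masses one needs $|\mathbf{p}_1|$, the solution of $E_{\text{cm}} = \sqrt{p_1^2 + m_1^2} + \sqrt{p_1^2 + m_2^2}$. The sketch below uses the standard closed-form solution of this condition (a kinematic identity, not derived in the text above) and prints the factor $2|\mathbf{p}_1|/E_{\text{cm}}$ for increasing energy. Masses and energies are hypothetical, and the function names are illustrative.

```python
import math


def cm_momentum(e_cm, m1, m2):
    """|p_1| in the CM frame, solving E_cm = sqrt(p^2 + m1^2) + sqrt(p^2 + m2^2)."""
    if e_cm < m1 + m2:
        raise ValueError("E_cm is below the two-particle threshold")
    return math.sqrt((e_cm**2 - (m1 + m2) ** 2) * (e_cm**2 - (m1 - m2) ** 2)) / (2.0 * e_cm)


def two_body_phase_space(e_cm, m1, m2):
    """Eq. (4.82) integrated over the full solid angle: |p_1| / (4 pi E_cm)."""
    return cm_momentum(e_cm, m1, m2) / (4.0 * math.pi * e_cm)


def high_energy_factor(e_cm, m1, m2):
    """The last factor of (4.83): 2 |p_1| / E_cm."""
    return 2.0 * cm_momentum(e_cm, m1, m2) / e_cm


for e_cm in (3.5, 10.0, 1000.0):
    print(e_cm, high_energy_factor(e_cm, 1.0, 2.0))
```

For massless final-state particles the factor is exactly 1 at every energy.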

Applying this simplification to (4.79), we find the following form of the
cross section for two final-state particles:

$$\Big(\frac{d\sigma}{d\Omega}\Big)_{\text{CM}} = \frac{1}{2E_A\, 2E_B\, |v_A - v_B|} \frac{|\mathbf{p}_1|}{(2\pi)^2\, 4E_{\text{cm}}}\, |\mathcal{M}(p_A, p_B \to p_1, p_2)|^2. \tag{4.84}$$

In the special case where all four particles have identical masses (including the
commonly seen limit $m \to 0$), this reduces to the formula quoted in Chapter 1,

$$\Big(\frac{d\sigma}{d\Omega}\Big)_{\text{CM}} = \frac{|\mathcal{M}|^2}{64\pi^2 E_{\text{cm}}^2} \quad \text{(all four masses identical)}. \tag{4.85}$$

To conclude this section, we should derive a formula for the differential
decay rate, $d\Gamma$, in terms of $\mathcal{M}$. The correct expression is only a slight
modification of (4.79), and is quite easy to guess: Just remove from (4.79) the factors
that do not make sense when the initial state consists of a single particle. The
definition of $\Gamma$ assumes that the decaying particle is at rest, so the
normalization factor $(2E_A)^{-1}$ becomes $(2m_A)^{-1}$. (In any other frame, this factor would
give the usual time dilation.) Thus the decay rate formula is

$$d\Gamma = \frac{1}{2m_A} \Big(\prod_f \frac{d^3p_f}{(2\pi)^3} \frac{1}{2E_f}\Big) |\mathcal{M}(m_A \to \{p_f\})|^2\, (2\pi)^4 \delta^{(4)}\Big(p_A - \sum p_f\Big). \tag{4.86}$$

Unfortunately, the meaning of this formula is far from clear. Since an unstable
particle cannot be sent into the infinitely distant past, our definition (4.73)
of $\mathcal{M}(m_A \to \{p_f\})$ in terms of the $S$-matrix makes no sense in this context.
Nevertheless formula (4.86) is correct, when $\mathcal{M}$ is computed according to the
Feynman rules for $S$-matrix elements that we will present in the following
section. We postpone the further discussion of these matters, and the proof
of Eq. (4.86), until Section 7.3. Until then, an intuitive notion of $\mathcal{M}$ as a
transition amplitude should suffice.

Equations (4.79) and (4.86) are completely general, whether or not the
final state contains several identical particles. (The computation of $\mathcal{M}$, of
course, will be quite different when identical particles are present, but that is
another matter.) When integrating either of these formulae to obtain a total
cross section or decay rate, however, we must be careful to avoid counting the
same final state several times. If there are $n$ identical particles in the final
state, we must either restrict the integration to inequivalent configurations,
or divide by $n!$ after integrating over all sets of momenta.
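As an illustration of how (4.86) and the identical-particle rule combine, consider a two-body decay with $|\mathcal{M}|^2$ assumed independent of angle (an assumption made here for simplicity, not a statement from the text). Using the two-body phase space (4.82) in the rest frame, where $E_{\text{cm}} = m_A$, the total rate becomes $\Gamma = |\mathcal{M}|^2 |\mathbf{p}_1| / (8\pi m_A^2)$, divided by $2!$ if the two final particles are identical. The function names and input values below are hypothetical.

```python
import math


def cm_momentum(e_cm, m1, m2):
    """Momentum of either decay product in the rest frame of the parent."""
    return math.sqrt((e_cm**2 - (m1 + m2) ** 2) * (e_cm**2 - (m1 - m2) ** 2)) / (2.0 * e_cm)


def two_body_decay_rate(m_sq, m_a, m_1, m_2, identical=False):
    """Total rate from (4.86) and (4.82), assuming angle-independent |M|^2 = m_sq."""
    p = cm_momentum(m_a, m_1, m_2)
    rate = m_sq * p / (8.0 * math.pi * m_a**2)
    # Two identical final-state particles: divide by 2! to avoid double counting.
    return rate / 2.0 if identical else rate


m_sq, m_a = 0.04, 5.0  # hypothetical |M|^2 and parent mass
print(two_body_decay_rate(m_sq, m_a, 1.0, 2.0))
print(two_body_decay_rate(m_sq, m_a, 1.0, 1.0, identical=True))
```

In this scalar sketch $|\mathcal{M}|^2$ carries dimensions of mass squared, so the rate has the expected dimensions of mass.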

4.6 Computing S-Matrix Elements 
from Feynman Diagrams 

Now that we have formulae for cross sections and decay rates in terms of
the invariant matrix element $\mathcal{M}$, the only remaining task is to find a way of
computing $\mathcal{M}$ for various processes in various interacting field theories. In this
section we will write down (and try to motivate) a formula for $\mathcal{M}$ in terms
of Feynman diagrams. We postpone the actual proof of this formula until
Section 7.2, since the proof is somewhat technical and will be much easier to
understand after we have seen how the formula is used.

Recall from its definition, Eq. (4.71), that the $S$-matrix is simply the
time-evolution operator, $\exp(-iHt)$, in the limit of very large $t$:

$$\langle\mathbf{p}_1\mathbf{p}_2\cdots|\, S\, |\mathbf{k}_A\mathbf{k}_B\rangle = \lim_{T\to\infty} \langle\mathbf{p}_1\mathbf{p}_2\cdots|\, e^{-iH(2T)}\, |\mathbf{k}_A\mathbf{k}_B\rangle. \tag{4.87}$$

To compute this quantity we would like to replace the external plane-wave
states in (4.87), which are eigenstates of $H$, with their counterparts in the
unperturbed theory, which are eigenstates of $H_0$. We successfully made such
a replacement for the vacuum state $|\Omega\rangle$ in Eq. (4.27):

$$|\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \big(e^{-iE_0T} \langle\Omega|0\rangle\big)^{-1} e^{-iHT} |0\rangle.$$

This time we would like to find a relation of the form

$$|\mathbf{k}_A\mathbf{k}_B\rangle \propto \lim_{T\to\infty(1-i\epsilon)} e^{-iHT} |\mathbf{k}_A\mathbf{k}_B\rangle_0, \tag{4.88}$$

where we have omitted some unknown phases and overlap factors like those
in (4.27). To find such a relation would not be easy. In (4.27), we used the fact
that the vacuum was the state of absolute lowest energy. Here we can use only
the much weaker statement that the external states with well-separated initial
and final particles have the lowest energy consistent with the predetermined
nonzero values of momentum. The problem is a deep one, and it is associated
with one of the most fundamental difficulties of field theory, that interactions
affect not only the scattering of distinct particles but also the form of the
single-particle states themselves.

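If the formula (4.88) could somehow be justified, we could use it to rewrite
the right-hand side of (4.87) as

$$\begin{aligned}
& \lim_{T\to\infty(1-i\epsilon)} {}_0\langle\mathbf{p}_1\cdots\mathbf{p}_n|\, e^{-iH(2T)}\, |\mathbf{p}_A\mathbf{p}_B\rangle_0 \\
& \qquad \propto \lim_{T\to\infty(1-i\epsilon)} {}_0\langle\mathbf{p}_1\cdots\mathbf{p}_n|\, T\Big(\exp\Big[-i\int_{-T}^{T} dt\, H_I(t)\Big]\Big) |\mathbf{p}_A\mathbf{p}_B\rangle_0.
\end{aligned} \tag{4.89}$$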


In the evaluation of vacuum expectation values, the awkward proportionality
factors between free and interacting vacuum states cancelled out of the final
formula, Eq. (4.31). In the present case those factors are so horrible that we
have not even attempted to write them down; we only hope that a similar
dramatic cancellation will take place here. In fact such a cancellation does
take place, although it is not easy to derive this conclusion from our present
approach. Up to one small modification (which is unimportant for our present
purposes), the formula for the nontrivial part of the $S$-matrix can be simplified
to the following form:

$$\langle\mathbf{p}_1\cdots\mathbf{p}_n|\, iT\, |\mathbf{p}_A\mathbf{p}_B\rangle = \lim_{T\to\infty(1-i\epsilon)} \Big({}_0\langle\mathbf{p}_1\cdots\mathbf{p}_n|\, T\Big(\exp\Big[-i\int_{-T}^{T} dt\, H_I(t)\Big]\Big) |\mathbf{p}_A\mathbf{p}_B\rangle_0\Big)_{\substack{\text{connected,}\\ \text{amputated}}}. \tag{4.90}$$


The attributes “connected” and “amputated” refer to restrictions on the class 
of possible Feynman diagrams; these terms will be defined in a moment. We 
will prove Eq. (4.90) in Section 7.2. In the remainder of this section, we will 
explain this formula and motivate the new restrictions that we have added. 

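First we must learn how to represent the matrix element in (4.90) as a
sum of Feynman diagrams. Let us evaluate the first few terms explicitly, in
$\phi^4$ theory, for the case of two particles in the final state. The first term is

$$\begin{aligned}
{}_0\langle\mathbf{p}_1\mathbf{p}_2|\mathbf{p}_A\mathbf{p}_B\rangle_0 &= \sqrt{2E_1\, 2E_2\, 2E_A\, 2E_B}\; \langle 0|\, a_1 a_2 a_A^\dagger a_B^\dagger\, |0\rangle \\
&= 2E_A\, 2E_B\, (2\pi)^6 \Big(\delta^{(3)}(\mathbf{p}_A - \mathbf{p}_1)\, \delta^{(3)}(\mathbf{p}_B - \mathbf{p}_2) + \delta^{(3)}(\mathbf{p}_A - \mathbf{p}_2)\, \delta^{(3)}(\mathbf{p}_B - \mathbf{p}_1)\Big).
\end{aligned} \tag{4.91}$$

The delta functions force the final state to be identical to the initial state,
so this term is part of the ‘1’ in $S = 1 + iT$, and does not contribute to the
scattering matrix element $\mathcal{M}$. We can represent it diagrammatically as

[diagrams]

The next term in $\langle\mathbf{p}_1\mathbf{p}_2|\, S\, |\mathbf{p}_A\mathbf{p}_B\rangle$ is

$$\begin{aligned}
& {}_0\langle\mathbf{p}_1\mathbf{p}_2|\, T\Big(-i\frac{\lambda}{4!} \int d^4x\, \phi_I^4(x)\Big) |\mathbf{p}_A\mathbf{p}_B\rangle_0 \\
& \qquad = {}_0\langle\mathbf{p}_1\mathbf{p}_2|\, N\Big(-i\frac{\lambda}{4!} \int d^4x\, \phi_I^4(x) + \text{contractions}\Big) |\mathbf{p}_A\mathbf{p}_B\rangle_0,
\end{aligned} \tag{4.92}$$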


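using Wick’s theorem. Since the external states are not $|0\rangle$, terms that are not
fully contracted do not necessarily vanish; we can use an annihilation operator
from $\phi_I(x)$ to annihilate an initial-state particle, or a creation operator from
$\phi_I(x)$ to produce a final-state particle. For example,

$$\begin{aligned}
\phi_I^+(x)\, |\mathbf{p}\rangle_0 &= \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{k}}}}\, a_{\mathbf{k}}\, e^{-ik\cdot x}\, \sqrt{2E_{\mathbf{p}}}\, a_{\mathbf{p}}^\dagger\, |0\rangle \\
&= \int \frac{d^3k}{(2\pi)^3} \frac{1}{\sqrt{2E_{\mathbf{k}}}}\, e^{-ik\cdot x}\, \sqrt{2E_{\mathbf{p}}}\, (2\pi)^3 \delta^{(3)}(\mathbf{k} - \mathbf{p})\, |0\rangle \\
&= e^{-ip\cdot x}\, |0\rangle.
\end{aligned} \tag{4.93}$$

An uncontracted $\phi_I$ operator inside the $N$-product of (4.92) has two terms:
$\phi_I^+$ on the far right and $\phi_I^-$ on the far left. We get one contribution to the
$S$-matrix element for each way of commuting the $a$ of $\phi_I^+$ past an initial-state
$a^\dagger$, and one contribution for each way of commuting the $a^\dagger$ of $\phi_I^-$ past a
final-state $a$. It is natural, then, to define the contractions of field operators with
external states as follows:

$$\overbracket{\phi_I(x)\, |\mathbf{p}\rangle} = e^{-ip\cdot x}; \qquad \overbracket{\langle\mathbf{p}|\, \phi_I(x)} = e^{+ip\cdot x}. \tag{4.94}$$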

To evaluate an $S$-matrix element such as (4.92), we simply write down all
possible full contractions of the $\phi_I$ operators and the external-state momenta.

To see that this prescription is correct, let us evaluate (4.92) in detail.
The $N$-product contains terms of the form

$$\phi\phi\phi\phi; \qquad \overbracket{\phi\phi}\,\phi\phi; \qquad \overbracket{\phi\phi}\;\overbracket{\phi\phi}. \tag{4.95}$$

The last term, in which the $\phi$ operators are fully contracted with each other, is
equal to a vacuum bubble diagram times the value of (4.91) calculated above:

$$-i\frac{\lambda}{4!} \int d^4x\; {}_0\langle\mathbf{p}_1\mathbf{p}_2|\, \overbracket{\phi\phi}\;\overbracket{\phi\phi}\, |\mathbf{p}_A\mathbf{p}_B\rangle_0 = [\text{vacuum bubble diagram}] \times [\text{diagrams of (4.91)}]. \tag{4.96}$$

This is just another contribution to the trivial part of the $S$-matrix, so we
ignore it.

Next consider the second term of (4.95), in which two of the four $\phi$
operators are contracted. The normal-ordered product of the remaining two fields
looks like $(a^\dagger a^\dagger + 2a^\dagger a + aa)$. As we commute these operators past the $a$’s and
$a^\dagger$’s of the initial and final states, we find that only a term with an equal
number of $a$’s and $a^\dagger$’s can survive. In the language of contractions, this says that
one of the $\phi$’s must be contracted with an initial-state $|\mathbf{p}\rangle$, the other with a
final-state $\langle\mathbf{p}|$. The uncontracted $|\mathbf{p}\rangle$ and $\langle\mathbf{p}|$ give a delta function as in (4.91).
To represent these quantities diagrammatically, we introduce external lines to
our Feynman rules:

$$\overbracket{\phi_I(x)\, |\mathbf{p}\rangle} = [\text{incoming external line}]; \qquad \overbracket{\langle\mathbf{p}|\, \phi_I(x)} = [\text{outgoing external line}]. \tag{4.97}$$

Feynman diagrams for $S$-matrix elements will always contain external lines,
rather than the external points of diagrams for correlation functions. The
second term of (4.95) thus yields four diagrams:

[diagrams]

The integration $\int d^4x$ produces a momentum-conserving delta function at
each vertex (including the external momenta), so these diagrams again
describe trivial processes in which the initial and final states are identical. This
illustrates a general principle: Only fully connected diagrams, in which all
external lines are connected to each other, contribute to the $T$-matrix.

Finally, consider the term of (4.95) in which none of the $\phi$ operators are
contracted with each other. Our prescription tells us to contract two of the
$\phi$'s with $|\mathbf{p}_A\mathbf{p}_B\rangle$ and the other two with $\langle\mathbf{p}_1\mathbf{p}_2|$. There are 4! ways to do this.
Thus we obtain the diagram

$$\big[\text{four-point vertex diagram}\big] = (4!)\cdot\Big(-i\frac{\lambda}{4!}\Big)\int d^4x\; e^{-i(p_A+p_B-p_1-p_2)\cdot x} = -i\lambda\,(2\pi)^4\,\delta^{(4)}(p_A+p_B-p_1-p_2). \tag{4.98}$$

This is exactly of the form $i\mathcal{M}\,(2\pi)^4\delta^{(4)}(p_A+p_B-p_1-p_2)$, with $\mathcal{M} = -\lambda$.

Before continuing our discussion of Feynman diagrams for S-matrix elements,
we should certainly pause to turn this result into a cross section. For
scattering in the center-of-mass frame, we can simply plug $|\mathcal{M}|^2 = \lambda^2$ into
Eq. (4.85) to obtain

$$\Big(\frac{d\sigma}{d\Omega}\Big)_{CM} = \frac{\lambda^2}{64\pi^2 E_{cm}^2}. \tag{4.99}$$

We have just computed our first quantum field theory cross section. It is a
rather dull result, having no angular dependence at all. (This situation will
be remedied when we consider fermions in the next section.) Integrating over
$d\Omega$, and dividing by 2 since there are two identical particles in the final state,
we find the total cross section,

$$\sigma_{\text{total}} = \frac{\lambda^2}{32\pi E_{cm}^2}. \tag{4.100}$$

In practice, one would probably use this result to measure the value of $\lambda$.
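As a quick numerical check of the step from (4.99) to (4.100), the following Python sketch integrates the isotropic differential cross section over the full solid angle and applies the factor 1/2 for identical final-state particles. The function names, the simple midpoint sum, and the sample values of the coupling and energy are illustrative assumptions, not part of the text.

```python
import math

def dsigma_domega_cm(lam, e_cm):
    """Differential cross section (4.99), natural units; no angular dependence."""
    return lam**2 / (64 * math.pi**2 * e_cm**2)

def sigma_total(lam, e_cm, n_steps=1000):
    """Integrate (4.99) over the solid angle, then divide by 2 (identical particles)."""
    # dOmega = dphi dcos(theta); the phi integral contributes 2*pi.
    dcos = 2.0 / n_steps
    integral = 0.0
    for _ in range(n_steps):
        integral += dsigma_domega_cm(lam, e_cm) * dcos
    return 2 * math.pi * integral / 2

def sigma_total_closed_form(lam, e_cm):
    """Closed form (4.100)."""
    return lam**2 / (32 * math.pi * e_cm**2)

if __name__ == "__main__":
    lam, e_cm = 0.5, 10.0  # hypothetical coupling and center-of-mass energy
    print(sigma_total(lam, e_cm), sigma_total_closed_form(lam, e_cm))
```

Because the integrand is constant, the numerical integral is trivial here; the same loop would still apply to an angle-dependent amplitude if `dsigma_domega_cm` were given a dependence on the scattering angle.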

Returning to our general discussion, let us consider some higher-order
contributions to the T-matrix for the process $A, B \to 1, 2$. If we ignore, for the
moment, the “connected and amputated” prescription, we have the formula

$$\langle\mathbf{p}_1\mathbf{p}_2|\,iT\,|\mathbf{p}_A\mathbf{p}_B\rangle \overset{?}{=} \big[\text{sum of diagrams}\big] \tag{4.101}$$

plus diagrams in which the four external lines are not all connected to each
other. We have already seen that this last class of diagrams gives no contribution
to the T-matrix. The first diagram shown in (4.101) gives the lowest-order
contribution to T, which we calculated above. The next three diagrams give
expected corrections to this amplitude, involving creation and annihilation of
additional “virtual” particles.

The diagrams in the second line of (4.101) contain disconnected “vacuum
bubbles”. By the same argument as at the end of Section 4.4, the disconnected
pieces exponentiate to an overall phase factor giving the shift of the energy
of the interacting vacuum state upon which the scattering takes place. Thus
they are irrelevant to S. We have now seen that only fully connected diagrams
give sensible contributions to S-matrix elements.

The last diagram is more problematical; let us evaluate it. After integrating
over the two vertex positions, we obtain

$$\frac{1}{2}\int\frac{d^4p'}{(2\pi)^4}\,\frac{i}{p'^2-m^2}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2-m^2} \times (-i\lambda)(2\pi)^4\delta^{(4)}(p_A+p'-p_1-p_2) \times (-i\lambda)(2\pi)^4\delta^{(4)}(p_B-p'). \tag{4.102}$$

We can integrate over $p'$ using the second delta function. It tells us to evaluate

$$\frac{1}{p'^2-m^2}\bigg|_{p'=p_B} = \frac{1}{p_B^2-m^2} = \frac{1}{0}.$$

We get infinity, since $p_B$, being the momentum of an external particle, is
on-shell: $p_B^2 = m^2$. This is a disaster. Clearly, our formula for S makes sense only
if we exclude diagrams of this form, that is, diagrams with loops connected to
only one external leg. Fortunately, this is physically reasonable: In the same
way that the vacuum bubble diagrams represent the evolution of $|0\rangle$ into $|\Omega\rangle$,
these external leg corrections represent the evolution of $|\mathbf{p}\rangle_0$ into $|\mathbf{p}\rangle$, the
single-particle state of the interacting theory. Since these corrections have
nothing to do with the scattering process, we should exclude them from the
computation of S.

For a general diagram with external legs, we define amputation in the
following way. Starting from the tip of each external leg, find the last point
at which the diagram can be cut by removing a single propagator, such that
this operation separates the leg from the rest of the diagram. Cut there. For
example:

[Figure: example of an amputated diagram]

Let us summarize our prescription for calculating scattering amplitudes.
Our formula for S-matrix elements, Eq. (4.90), can be rewritten

$$i\mathcal{M}\cdot(2\pi)^4\,\delta^{(4)}\Big(p_A+p_B-\sum p_f\Big) = \begin{pmatrix}\text{sum of all connected, amputated Feynman}\\ \text{diagrams with } p_A,\ p_B \text{ incoming, } p_f \text{ outgoing}\end{pmatrix}. \tag{4.103}$$

By ‘connected’, we now mean fully connected, that is, with no vacuum bubbles,
and all external legs connected to each other. The Feynman rules for
scattering amplitudes in $\phi^4$ theory are, in position space,


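1. For each propagator, [line from $x$ to $y$] $= D_F(x-y)$;

2. For each vertex, [vertex at $x$] $= (-i\lambda)\int d^4x$;

3. For each external line, [line carrying $p$ into the vertex at $x$] $= e^{-ip\cdot x}$;

4. Divide by the symmetry factor.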

Notice that the factor for an ingoing line is just the amplitude for that particle 
to be found at the vertex it connects to, i.e., the particle’s wavefunction.
Similarly, the factor for an outgoing line is the amplitude for a particle produced
at the vertex to have the desired final momentum. 

Just as with the Feynman rules for correlation functions, it is usually 
simpler to introduce the momentum-space representation of the propagators, 
carry out the vertex integrals to obtain momentum-conserving delta functions, 
and use these delta functions to evaluate as many momentum integrals as 
possible. In a scattering amplitude, however, there will always be an overall 
delta function, which can be used to cancel the one on the left-hand side of 
Eq. (4.103). We are then left with 


$$i\mathcal{M} = \text{sum of all connected, amputated diagrams}, \tag{4.104}$$

where the diagrams are evaluated according to the following rules:

1. For each propagator, [internal line with momentum $p$] $= \dfrac{i}{p^2-m^2+i\epsilon}$;

2. For each vertex, [vertex] $= -i\lambda$;

3. For each external line, [external line] $= 1$;

4. Impose momentum conservation at each vertex;

5. Integrate over each undetermined loop momentum: $\displaystyle\int\frac{d^4p}{(2\pi)^4}$;

6. Divide by the symmetry factor.

This is our final version of the Feynman rules for $\phi^4$ theory; these rules are
also listed in the Appendix, for reference.

Actually, Eq. (4.103) still isn’t quite correct. One more modification is necessary,
involving the proportionality factors that were omitted from Eq. (4.89).
But the modification affects only diagrams containing loops, so we postpone
its discussion until Chapters 6 and 7, where we first evaluate such diagrams.
We will prove the corrected formula (4.103) in Section 7.2, by relating
S-matrix elements to correlation functions, for which we have actually derived
a formula in terms of Feynman diagrams.

4.7 Feynman Rules for Fermions

So far in this chapter we have discussed only $\phi^4$ theory, in order to avoid
unnecessary complication. We are now ready to generalize our results to theories
containing fermions.

Our treatment of correlation functions in Section 4.2 generalizes without
difficulty. Lorentz invariance requires that the interaction Hamiltonian $H_I$ be
a product of an even number of spinor fields, so no difficulties arise in defining
the time-ordered exponential of $H_I$.

To apply Wick’s theorem, however, we must generalize the definitions of
the time-ordering and normal-ordering symbols to include fermions. We saw
at the end of Section 3.5 that the time-ordering operator T acting on two
spinor fields is most conveniently defined with an additional minus sign:

$$T\big(\psi(x)\bar\psi(y)\big) \equiv \begin{cases} \psi(x)\bar\psi(y) & \text{for } x^0 > y^0; \\ -\bar\psi(y)\psi(x) & \text{for } x^0 < y^0. \end{cases} \tag{4.105}$$

With this definition, the Feynman propagator for the Dirac field is

$$S_F(x-y) = \int\frac{d^4p}{(2\pi)^4}\,\frac{i(\not{p}+m)}{p^2-m^2+i\epsilon}\,e^{-ip\cdot(x-y)} = \langle 0|\,T\psi(x)\bar\psi(y)\,|0\rangle. \tag{4.106}$$

For products of more than two spinor fields, we generalize this definition in
the natural way: The time-ordered product picks up one minus sign for each
interchange of operators that is necessary to put the fields in time order. For
example,

$$T(\psi_1\psi_2\psi_3\psi_4) = (-1)^3\,\psi_3\psi_1\psi_4\psi_2 \quad \text{if } x_3^0 > x_1^0 > x_4^0 > x_2^0.$$

The definition of the normal-ordered product of spinor fields is analogous:
Put in an extra minus sign for each fermion interchange. The anticommutation
properties make it possible to write a normal-ordered product in several ways,
but with our conventions these are completely equivalent:

$$N(a_{\mathbf{p}}a_{\mathbf{q}}a^\dagger_{\mathbf{r}}) = (-1)^2\,a^\dagger_{\mathbf{r}}a_{\mathbf{p}}a_{\mathbf{q}} = (-1)^3\,a^\dagger_{\mathbf{r}}a_{\mathbf{q}}a_{\mathbf{p}}.$$
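The sign bookkeeping for fermionic time ordering and normal ordering reduces to the parity of a permutation. The Python sketch below is an illustration only: the function names and the convention that `times[k-1]` holds the time of field number k are assumptions, and all times are assumed distinct.

```python
def reordering_sign(order):
    """Return (-1)**k, where k counts the pairwise interchanges (inversions)
    needed to bring `order` into increasing order."""
    inversions = 0
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            if order[i] > order[j]:
                inversions += 1
    return -1 if inversions % 2 else 1

def time_order(times):
    """Return (sign, labels): the 1-based field labels sorted latest-first,
    together with the fermionic sign of that rearrangement."""
    labels = sorted(range(1, len(times) + 1),
                    key=lambda k: times[k - 1], reverse=True)
    return reordering_sign(labels), labels

if __name__ == "__main__":
    # x_3^0 > x_1^0 > x_4^0 > x_2^0, as in the time-ordering example
    print(time_order([3.0, 1.0, 4.0, 2.0]))
    # N(a_p a_q a_r^dagger): labels 1, 2, 3 rearranged to (3, 1, 2) and (3, 2, 1)
    print(reordering_sign([3, 1, 2]), reordering_sign([3, 2, 1]))
```

The parity of the inversion count equals the parity of the number of adjacent interchanges, which is why the two rewritings of the normal-ordered product above carry the factors $(-1)^2$ and $(-1)^3$.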

Using these definitions, it is not hard to generalize Wick’s theorem. Consider
first the case of two Dirac fields, say $T[\psi(x)\bar\psi(y)]$. In analogy with (4.37),
define the contraction of two fields by

$$T[\psi(x)\bar\psi(y)] = N[\psi(x)\bar\psi(y)] + \overbracket{\psi(x)\bar\psi(y)}. \tag{4.107}$$

Explicitly, for the Dirac field,

$$\overbracket{\psi(x)\bar\psi(y)} \equiv \begin{cases} \{\psi^+(x),\,\bar\psi^-(y)\} & \text{for } x^0 > y^0 \\ -\{\bar\psi^+(y),\,\psi^-(x)\} & \text{for } x^0 < y^0 \end{cases} \;=\; S_F(x-y); \tag{4.108}$$

$$\overbracket{\psi(x)\psi(y)} = \overbracket{\bar\psi(x)\bar\psi(y)} = 0. \tag{4.109}$$

Define contractions under the normal-ordering symbol to include minus signs
for operator interchanges:

$$N(\overbracket{\psi_1\psi_2\bar\psi_3}\,\bar\psi_4) = -\overbracket{\psi_1\bar\psi_3}\;N(\psi_2\bar\psi_4) = -S_F(x_1-x_3)\,N(\psi_2\bar\psi_4). \tag{4.110}$$

With these conventions, Wick’s theorem takes the same form as before:

$$T[\psi_1\bar\psi_2\psi_3\cdots] = N[\psi_1\bar\psi_2\psi_3\cdots + \text{all possible contractions}]. \tag{4.111}$$

The proof is essentially unchanged from the bosonic case, since all extra minus
signs are accounted for by the above definitions.

Yukawa Theory 

Writing down the Feynman rules for fermion correlation functions would now
be easy, but instead let’s press on and discuss scattering processes. For
definiteness, we begin by analyzing the Yukawa theory:

$$H = H_{\text{Dirac}} + H_{\text{Klein-Gordon}} + \int d^3x\; g\,\bar\psi\psi\phi. \tag{4.112}$$

This is a simplified model of Quantum Electrodynamics. In this section we
will carefully work out the rules of calculation for Yukawa theory, so that in
the next section we can guess the rules for QED without too much difficulty.
To be even more specific, consider the two-particle scattering reaction

fermion(p) + fermion(k) → fermion(p') + fermion(k').

The leading contribution comes from the $H_I^2$ term of the S-matrix:

$${}_0\langle\mathbf{p}',\mathbf{k}'|\;T\Big(\frac{1}{2!}\,(-ig)\int d^4x\;\bar\psi_I\psi_I\phi_I\;(-ig)\int d^4y\;\bar\psi_I\psi_I\phi_I\Big)\,|\mathbf{p},\mathbf{k}\rangle_0. \tag{4.113}$$

To evaluate this expression, use Wick’s theorem to reduce the T-product to
an N-product of contractions, then act the uncontracted fields on the initial-
and final-state particles. Represent this latter process as the contraction

$$\overbracket{\psi_I(x)\,|\mathbf{p},s\rangle} = \int\frac{d^3p'}{(2\pi)^3}\,\frac{1}{\sqrt{2E_{\mathbf{p}'}}}\sum_{s'} a^{s'}_{\mathbf{p}'}\,u^{s'}(p')\,e^{-ip'\cdot x}\,\sqrt{2E_{\mathbf{p}}}\;a^{s\dagger}_{\mathbf{p}}\,|0\rangle = e^{-ip\cdot x}\,u^s(p)\,|0\rangle. \tag{4.114}$$

Similar expressions hold for the contraction of $\bar\psi_I$ with a final-state fermion,
and for contractions of $\psi_I$ and $\bar\psi_I$ with antifermion states. Note that $\psi_I$ can
be contracted with a fermion on the right or an antifermion on the left; the
opposite is true for $\bar\psi_I$.

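We can write a typical contribution to the matrix element (4.113) as the
contraction

$$\langle\mathbf{p}',\mathbf{k}'|\;\frac{1}{2!}\,(-ig)\int d^4x\;\bar\psi\psi\phi\;(-ig)\int d^4y\;\bar\psi\psi\phi\;|\mathbf{p},\mathbf{k}\rangle, \tag{4.115}$$

in which $\phi(x)$ is contracted with $\phi(y)$, the fields $\bar\psi(x)$ and $\psi(x)$ with $\langle\mathbf{p}'|$ and
$|\mathbf{p}\rangle$, and the fields $\bar\psi(y)$ and $\psi(y)$ with $\langle\mathbf{k}'|$ and $|\mathbf{k}\rangle$.
Up to a possible minus sign, the value of this quantity is

$$(-ig)^2\int\frac{d^4q}{(2\pi)^4}\,\frac{i}{q^2-m_\phi^2}\,(2\pi)^4\delta(p'-p-q) \times (2\pi)^4\delta(k'-k+q)\;\bar u(p')u(p)\,\bar u(k')u(k).$$

(We have dropped the factor 1/2! because there is a second, identical term
that comes from interchanging x and y in (4.115).) Using either delta function
to perform the integral, we find that this expression takes the form
$i\mathcal{M}\,(2\pi)^4\delta(\Sigma p)$, with

$$i\mathcal{M} = \frac{-ig^2}{q^2-m_\phi^2}\;\bar u(p')u(p)\,\bar u(k')u(k). \tag{4.116}$$

When writing it in this way, we must remember to impose the constraints
$p' - p = q = k - k'$ that follow from the two delta functions.

Instead of working from (4.115), we could draw a Feynman diagram:

[Figure: fermion lines p → p' and k → k' joined by a dashed scalar line carrying momentum q]

We denote scalar particles by dashed lines, and fermions by solid lines. The
S-matrix element could then be obtained directly from the following
momentum-space Feynman rules.

1. Propagators:

   $\overbracket{\phi(x)\phi(y)}$ = [dashed line, momentum $q$] $= \dfrac{i}{q^2-m_\phi^2+i\epsilon}$;

   $\overbracket{\psi(x)\bar\psi(y)}$ = [solid line, momentum $p$] $= \dfrac{i(\not{p}+m)}{p^2-m^2+i\epsilon}$.

2. Vertices: [vertex] $= -ig$.

3. External leg contractions:

   $\overbracket{\phi\,|\mathbf{q}\rangle} = 1$; $\qquad \overbracket{\langle\mathbf{q}|\,\phi} = 1$ (scalar);

   $\overbracket{\psi\,|\mathbf{p},s\rangle} = u^s(p)$; $\qquad \overbracket{\langle\mathbf{p},s|\,\bar\psi} = \bar u^s(p)$ (fermion);

   $\overbracket{\bar\psi\,|\mathbf{k},s\rangle} = \bar v^s(k)$; $\qquad \overbracket{\langle\mathbf{k},s|\,\psi} = v^s(k)$ (antifermion).

4. Impose momentum conservation at each vertex.

5. Integrate over each undetermined loop momentum.

6. Figure out the overall sign of the diagram.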

Several comments are in order regarding these rules.

First, note that the $1/n!$ from the Taylor series of the time-ordered exponential
is always canceled by the $n!$ ways of interchanging vertices to obtain
the same contraction. The diagrams of Yukawa theory never have symmetry
factors, since the three fields ($\bar\psi\psi\phi$) in $H_I$ cannot substitute for one another
in contractions.

Second, the direction of the momentum on a fermion line is always significant.
On external lines, as for bosons, the direction of the momentum is always
ingoing for initial-state particles and outgoing for final-state particles. This
follows immediately from the expansions of $\psi$ and $\bar\psi$, where the annihilation
operators $a_{\mathbf{p}}$ and $b_{\mathbf{p}}$ both multiply $e^{-ip\cdot x}$ and the creation operators $a^\dagger_{\mathbf{p}}$ and $b^\dagger_{\mathbf{p}}$
both multiply $e^{+ip\cdot x}$. On internal fermion lines (propagators), the momentum
must be assigned in the direction of particle-number flow (for electrons, this
is the direction of negative charge flow). This requirement is most easily seen
by working out an example from first principles. Consider the annihilation of
a fermion and an antifermion into two bosons:

$$\big[\text{diagram}\big] = \langle\mathbf{k},\mathbf{k}'|\int d^4x\;\bar\psi\psi\phi\int d^4y\;\bar\psi\psi\phi\;|\mathbf{p},\mathbf{p}'\rangle \sim \int d^4x\int d^4y\;\bar v(p')\,e^{-ip'\cdot x}\int\frac{d^4q}{(2\pi)^4}\,\frac{i(\not{q}+m)}{q^2-m^2}\,e^{-iq\cdot(x-y)}\,u(p)\,e^{-ip\cdot y}.$$

The integrals over x and y give delta functions that force q to flow from y to x,
as shown. On internal boson lines the direction of the momentum is irrelevant
and may be chosen for convenience, since $D_F(x-y) = D_F(y-x)$.

It is conventional to draw arrows on fermion lines, as shown, to represent
the direction of particle-number flow. The momentum assigned to a fermion
propagator then flows in the direction of this arrow. For external antiparticles,
however, the momentum flows opposite to the arrow; it helps to show this
explicitly by drawing a second arrow next to the line.

Third, note that in our examples the Dirac indices contract together along
the fermion lines. This will also happen in more complicated diagrams:

$$\big[\text{diagram}\big] \sim \bar u(p_3)\cdot\frac{i(\not{p}_2+m)}{p_2^2-m^2}\cdot\frac{i(\not{p}_1+m)}{p_1^2-m^2}\cdot u(p_0). \tag{4.117}$$


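Finally, let’s take a moment to worry about fermion minus signs. Return
to the example of the fermion-fermion scattering process. We adopt a sign
convention for the initial and final states:

$$|\mathbf{p},\mathbf{k}\rangle \sim a^\dagger_{\mathbf{p}}a^\dagger_{\mathbf{k}}\,|0\rangle, \qquad \langle\mathbf{p}',\mathbf{k}'| \sim \langle 0|\,a_{\mathbf{k}'}a_{\mathbf{p}'}, \tag{4.118}$$

so that $(|\mathbf{p},\mathbf{k}\rangle)^\dagger = \langle\mathbf{p},\mathbf{k}|$. Then the contraction

$$\langle\mathbf{p}',\mathbf{k}'|\,(\bar\psi\psi)_x\,(\bar\psi\psi)_y\,|\mathbf{p},\mathbf{k}\rangle \sim \langle 0|\,a_{\mathbf{k}'}a_{\mathbf{p}'}\;\bar\psi_x\psi_x\;\bar\psi_y\psi_y\;a^\dagger_{\mathbf{p}}a^\dagger_{\mathbf{k}}\,|0\rangle,$$

with $a_{\mathbf{p}'}$ contracted with $\bar\psi_x$, $\psi_x$ with $a^\dagger_{\mathbf{p}}$, $a_{\mathbf{k}'}$ with $\bar\psi_y$, and $\psi_y$ with $a^\dagger_{\mathbf{k}}$,
can be untangled by moving $\bar\psi_y$ two spaces to the left, and so picks up a factor
of $(-1)^2 = +1$. But note that in the contraction of the same operators in which
$a_{\mathbf{k}'}$ is instead contracted with $\bar\psi_x$ and $a_{\mathbf{p}'}$ with $\bar\psi_y$,
it is sufficient to move the $\bar\psi_y$ one space to the left, giving a factor of $-1$. This
contraction corresponds to the diagram

[Figure: the exchange diagram, with the final-state lines p' and k' interchanged]

The full result, to lowest order, for the S-matrix element for this process
is therefore

$$i\mathcal{M} = \big[\text{direct diagram}\big] + \big[\text{exchange diagram}\big] = (-ig^2)\bigg(\bar u(p')u(p)\,\frac{1}{(p'-p)^2-m_\phi^2}\,\bar u(k')u(k) - \bar u(p')u(k)\,\frac{1}{(p'-k)^2-m_\phi^2}\,\bar u(k')u(p)\bigg). \tag{4.119}$$

The minus sign difference between these diagrams is a reflection of Fermi
statistics. Turning this expression into an explicit cross section would require
some additional work; we postpone such calculations until Chapter 5, when
we can work with QED instead of the less interesting Yukawa theory.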

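In complicated diagrams, one can often simplify the determination of the
minus signs by noting that the product $(\bar\psi\psi)$, or any other pair of fermions,
commutes with any operator. Thus, with $\psi_x$ contracted with $\bar\psi_z$, $\psi_z$ with $\bar\psi_y$,
and $\psi_y$ with $\bar\psi_w$,

$$\cdots(\bar\psi\psi)_x(\bar\psi\psi)_y(\bar\psi\psi)_z(\bar\psi\psi)_w\cdots = \cdots(+1)\,(\bar\psi\psi)_x(\bar\psi\psi)_z(\bar\psi\psi)_y(\bar\psi\psi)_w\cdots = \cdots S_F(x-z)\,S_F(z-y)\,S_F(y-w)\cdots.$$

But note that in a closed loop of n fermion propagators we have

$$\big[\text{closed fermion loop}\big] = \bar\psi\psi\;\bar\psi\psi\;\bar\psi\psi\;\bar\psi\psi \;\;(\text{contracted cyclically}) = (-1)\,\mathrm{tr}\big[\overbracket{\psi\bar\psi}\;\overbracket{\psi\bar\psi}\;\overbracket{\psi\bar\psi}\;\overbracket{\psi\bar\psi}\big] = (-1)\,\mathrm{tr}\big[S_F\,S_F\,S_F\,S_F\big]. \tag{4.120}$$

A closed fermion loop always gives a factor of $-1$ and the trace of a product
of Dirac matrices.

The Yukawa Potential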

We now have all the formal rules we need to compute scattering amplitudes 
in Yukawa theory. Before going on to discuss QED, let us briefly descend from 
abstraction to concrete physics, and consider one very simple application of 
these rules: the scattering of distinguishable fermions, in the nonrelativistic 
limit. By comparing the amplitude for this process to the Born approximation
formula from nonrelativistic quantum mechanics, we can determine the
potential V(r) created by the Yukawa interaction. 

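If the two interacting particles are distinguishable, only the first diagram
in (4.119) contributes. To evaluate the amplitude in the nonrelativistic
limit, we keep terms only to lowest order in the 3-momenta. Thus, up to
terms of $\mathcal{O}(\mathbf{p}^2, \mathbf{p}'^2, \ldots)$,

$$p = (m, \mathbf{p}), \qquad k = (m, \mathbf{k}), \qquad p' = (m, \mathbf{p}'), \qquad k' = (m, \mathbf{k}'). \tag{4.121}$$

Using these expressions, we have

$$(p'-p)^2 = -|\mathbf{p}'-\mathbf{p}|^2 + \mathcal{O}(\mathbf{p}^4), \qquad u^s(p) = \sqrt{m}\begin{pmatrix}\xi^s\\ \xi^s\end{pmatrix}, \quad \text{etc.},$$

where $\xi^s$ is a two-component constant spinor normalized to $\xi^{s'\dagger}\xi^s = \delta^{ss'}$. The
spinor products in (4.119) are then

$$\bar u^{s'}(p')\,u^s(p) = 2m\,\xi^{s'\dagger}\xi^s = 2m\,\delta^{ss'}; \qquad \bar u^{r'}(k')\,u^r(k) = 2m\,\xi^{r'\dagger}\xi^r = 2m\,\delta^{rr'}. \tag{4.122}$$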
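The spinor products in (4.122) can be checked numerically for particles at rest. The sketch below assumes the chiral (Weyl) representation, in which $\gamma^0$ has 2×2 blocks ((0, 1), (1, 0)), and the rest-frame spinors $u = \sqrt{m}\,(\xi, \xi)$ and $v = \sqrt{m}\,(\xi, -\xi)$; the helper names and the sample mass are illustrative assumptions.

```python
import math

# gamma^0 in the chiral basis (assumed representation): blocks [[0, 1], [1, 0]]
GAMMA0 = [[0, 0, 1, 0],
          [0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0]]

def u_rest(m, xi):
    """Fermion spinor at zero 3-momentum: u = sqrt(m) (xi, xi)."""
    r = math.sqrt(m)
    return [r * xi[0], r * xi[1], r * xi[0], r * xi[1]]

def v_rest(m, xi):
    """Antifermion spinor at zero 3-momentum: v = sqrt(m) (xi, -xi)."""
    r = math.sqrt(m)
    return [r * xi[0], r * xi[1], -r * xi[0], -r * xi[1]]

def bar_product(a, b):
    """Compute a-bar b = a^dagger gamma^0 b for two 4-component spinors."""
    total = 0
    for i in range(4):
        for j in range(4):
            total += complex(a[i]).conjugate() * GAMMA0[i][j] * b[j]
    return total

if __name__ == "__main__":
    m = 2.0                      # hypothetical fermion mass
    up, down = [1, 0], [0, 1]    # basis two-component spinors
    print(bar_product(u_rest(m, up), u_rest(m, up)))    # same spin
    print(bar_product(u_rest(m, up), u_rest(m, down)))  # opposite spin
    print(bar_product(v_rest(m, up), v_rest(m, up)))    # antifermion product
```

The same helper shows the sign flip for antifermion spinors, which is the origin of the extra minus sign discussed for fermion-antifermion scattering.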


So our first physical conclusion is that the spin of each particle is separately
conserved in this nonrelativistic scattering interaction, a pleasing result.
Putting together the pieces of the scattering amplitude (4.119), we find

$$i\mathcal{M} = \frac{ig^2}{|\mathbf{p}'-\mathbf{p}|^2+m_\phi^2}\;2m\,\delta^{ss'}\;2m\,\delta^{rr'}. \tag{4.123}$$

This should be compared with the Born approximation to the scattering amplitude
in nonrelativistic quantum mechanics, written in terms of the potential
function $V(\mathbf{x})$:

$$\langle p'|\,iT\,|p\rangle = -i\tilde V(\mathbf{q})\,(2\pi)\,\delta(E_{\mathbf{p}'}-E_{\mathbf{p}}), \qquad (\mathbf{q} = \mathbf{p}'-\mathbf{p}). \tag{4.124}$$

So apparently, for the Yukawa interaction,

$$\tilde V(\mathbf{q}) = \frac{-g^2}{|\mathbf{q}|^2+m_\phi^2}. \tag{4.125}$$

(The factors of 2m in (4.123) arise from our relativistic normalization conventions,
and must be dropped when comparing to (4.124), which assumes conventional
nonrelativistic normalization of states. The additional $\delta^{(3)}(\mathbf{p}'-\mathbf{p})$
goes away when we integrate over the momentum of the target.)

Inverting the Fourier transform to find $V(\mathbf{x})$ requires a short calculation:

$$\begin{aligned} V(\mathbf{x}) &= \int\frac{d^3q}{(2\pi)^3}\,\frac{-g^2}{|\mathbf{q}|^2+m_\phi^2}\,e^{i\mathbf{q}\cdot\mathbf{x}} \\ &= -\frac{g^2}{4\pi^2}\int_0^\infty dq\;q^2\;\frac{e^{iqr}-e^{-iqr}}{iqr}\;\frac{1}{q^2+m_\phi^2} \\ &= \frac{-g^2}{4\pi^2\,ir}\int_{-\infty}^{\infty} dq\;\frac{q\,e^{iqr}}{q^2+m_\phi^2}. \end{aligned} \tag{4.126}$$

The contour of this integral can be closed above in the complex plane, and
we pick up the residue of the simple pole at $q = +im_\phi$. Thus we find

$$V(r) = -\frac{g^2}{4\pi}\,\frac{1}{r}\,e^{-m_\phi r}, \tag{4.127}$$

an attractive “Yukawa potential”, with range $1/m_\phi = \hbar/m_\phi c$, the Compton
wavelength of the exchanged boson. Yukawa made this potential the basis for
his theory of the nuclear force, and worked backwards from the range of the
force (about 1 fm) to predict the mass (about 200 MeV) of the required boson,
the pion.
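The Fourier pair (4.125) and (4.127) can be cross-checked numerically in the easier direction: transforming the exponentially damped $V(r)$ back to momentum space. The Python sketch below does this with Simpson's rule, and also converts a force range into a mediator mass via $m = \hbar c/\text{range}$. The function names, the integration cutoff, the sample coupling, and the rounded value of $\hbar c$ are assumptions made for illustration.

```python
import math

HBAR_C_MEV_FM = 197.327  # hbar * c in MeV fm (rounded)

def yukawa_potential(r, g, m_phi):
    """V(r) of (4.127) in natural units."""
    return -g**2 / (4 * math.pi) * math.exp(-m_phi * r) / r

def fourier_transform_radial(potential, q, r_max=60.0, n_steps=60000):
    """V~(q) = 4 pi * integral_0^inf dr r^2 V(r) sin(qr)/(qr), by Simpson's rule.
    n_steps must be even; r_max cuts off the exponentially small tail."""
    h = r_max / n_steps

    def integrand(r):
        if r == 0.0:
            return 0.0  # r^2 V(r) sin(qr)/(qr) vanishes as r -> 0
        return 4 * math.pi * r * potential(r) * math.sin(q * r) / q

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def mediator_mass_mev(range_fm):
    """Mass whose Compton wavelength hbar/(m c) equals the given range."""
    return HBAR_C_MEV_FM / range_fm

if __name__ == "__main__":
    g, m_phi, q = 1.5, 1.0, 0.7  # hypothetical values, natural units
    numeric = fourier_transform_radial(lambda r: yukawa_potential(r, g, m_phi), q)
    print(numeric, -g**2 / (q**2 + m_phi**2))
    print(mediator_mass_mev(1.0))  # range of about 1 fm
```

A range of 1 fm corresponds to roughly 197 MeV, in line with the estimate of about 200 MeV quoted above.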

What happens if instead we scatter particles off of antiparticles? For the
process

$$f(p)\,\bar f(k) \to f(p')\,\bar f(k'),$$

we need to evaluate (nonrelativistically)

$$\bar v^s(k)\,v^{s'}(k') \approx m\,\big(\xi^{s\dagger},\,-\xi^{s\dagger}\big)\begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}\xi^{s'}\\-\xi^{s'}\end{pmatrix} = -2m\,\delta^{ss'}. \tag{4.128}$$

We must also work out the fermion minus sign. Using $|\mathbf{p},\mathbf{k}\rangle = a^\dagger_{\mathbf{p}}b^\dagger_{\mathbf{k}}\,|0\rangle$ and
$\langle\mathbf{p}',\mathbf{k}'| = \langle 0|\,b_{\mathbf{k}'}a_{\mathbf{p}'}$, we can write the contracted matrix element as

$$\langle\mathbf{p}',\mathbf{k}'|\;\bar\psi\psi\;\bar\psi\psi\;|\mathbf{p},\mathbf{k}\rangle = \langle 0|\,b_{\mathbf{k}'}a_{\mathbf{p}'}\;\bar\psi\psi\;\bar\psi\psi\;a^\dagger_{\mathbf{p}}b^\dagger_{\mathbf{k}}\,|0\rangle.$$

To untangle the contractions requires three operator interchanges, so there is
an overall factor of $-1$. This cancels the extra minus sign in (4.128), and therefore
we see that the Yukawa potential between a fermion and an antifermion
is also attractive, and identical in strength to that between two fermions.

The remaining case to consider is scattering of two antifermions. It should
not be surprising that the potential is again attractive; there is an additional
minus sign from changing the other $\bar u u$ into $\bar v v$, and the number of interchanges
necessary to untangle the contractions is even. Thus we conclude that the
Yukawa potential is universally attractive, whether it is between a pair of
fermions, a pair of antifermions, or one of each.

4.8 Feynman Rules for Quantum Electrodynamics 

Now we are ready to step from Yukawa theory to Quantum Electrodynamics.
To do this, we replace the scalar particle $\phi$ with a vector particle $A_\mu$, and
replace the Yukawa interaction Hamiltonian with

$$H_{\text{int}} = \int d^3x\; e\,\bar\psi\gamma^\mu\psi\,A_\mu. \tag{4.129}$$

How do the Feynman rules change? The answer, though difficult to prove, is
easy to guess. In addition to the fermion rules from the previous section, we
have

New vertex: [vertex with index $\mu$] $= -ie\gamma^\mu$

Photon propagator: [wavy line from $\mu$ to $\nu$, momentum $q$] $= \dfrac{-ig_{\mu\nu}}{q^2+i\epsilon}$

External photon lines: $\overbracket{A_\mu\,|\mathbf{p}\rangle} = \epsilon_\mu(p)$; $\qquad \overbracket{\langle\mathbf{p}|\,A_\mu} = \epsilon^*_\mu(p)$

Photons are conventionally drawn as wavy lines. The symbol $\epsilon_\mu(p)$ stands for
the polarization vector of the initial- or final-state photon.

To justify these rules, recall that in Lorentz gauge (which we employ to
retain explicit relativistic invariance) the field equation for $A_\mu$ is

$$\partial^2 A_\mu = 0. \tag{4.130}$$

Thus each component of A separately obeys the Klein-Gordon equation (with
m = 0). The momentum-space solutions of this equation are $\epsilon_\mu(p)\,e^{-ip\cdot x}$,
where $p^2 = 0$ and $\epsilon_\mu(p)$ is any 4-vector. The interpretation of $\epsilon$ as the
polarization vector of the field should be familiar from classical electromagnetism.
If we expand the quantized electromagnetic field in terms of classical solutions
of the wave equation, as we did for the Klein-Gordon field, we find

$$A_\mu(x) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_{r=0}^{3}\Big(a^r_{\mathbf{p}}\,\epsilon^r_\mu(p)\,e^{-ip\cdot x} + a^{r\dagger}_{\mathbf{p}}\,\epsilon^{r*}_\mu(p)\,e^{ip\cdot x}\Big), \tag{4.131}$$

where r = 0, 1, 2, 3 labels a basis of polarization vectors. The external line
factors in the Feynman rules above follow immediately from this expansion,
just as we obtained u’s and v’s as the external line factors for Dirac particles.
The only subtlety is that we must restrict initial- and final-state photons to
be transversely polarized: Their polarization vectors are always of the form
$\epsilon^\mu = (0, \boldsymbol{\epsilon})$, where $\mathbf{p}\cdot\boldsymbol{\epsilon} = 0$. For $\mathbf{p}$ along the z-axis, the right- and left-handed
polarization vectors are $\epsilon^\mu = (0, 1, \pm i, 0)/\sqrt{2}$.
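The transversality and normalization of the circular polarization vectors quoted above can be verified directly. The Python sketch below is illustrative: the function names, the metric signature (+, −, −, −), and the sample photon energy are assumptions.

```python
import math

def polarization(helicity):
    """epsilon^mu = (0, 1, +/- i, 0)/sqrt(2) for a photon moving along z;
    helicity = +1 (right-handed) or -1 (left-handed)."""
    s = 1 / math.sqrt(2)
    return [0, s, helicity * 1j * s, 0]

def dot3(a, b):
    """Three-vector product (a*) . b built from the spatial parts of two 4-vectors."""
    return sum(complex(x).conjugate() * y for x, y in zip(a[1:], b[1:]))

def minkowski(a, b):
    """a . b with signature (+, -, -, -), without complex conjugation."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

if __name__ == "__main__":
    p = [2.0, 0.0, 0.0, 2.0]  # hypothetical lightlike momentum along z
    eps_r, eps_l = polarization(+1), polarization(-1)
    print(minkowski(p, p))                             # on-shell: p^2 = 0
    print(minkowski(p, eps_r), minkowski(p, eps_l))    # transversality
    print(dot3(eps_r, eps_r), dot3(eps_r, eps_l))      # unit norm, orthogonality
```

Both vectors have vanishing time component, so the four-dimensional condition $p\cdot\epsilon = 0$ reduces to the three-dimensional one, $\mathbf{p}\cdot\boldsymbol{\epsilon} = 0$.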

The form of the QED vertex factor is also easy to justify, by simply
looking at the interaction Hamiltonian (4.129). Note that the $\gamma$ matrix in a
QED amplitude will sit between spinors or other $\gamma$ matrices, with the Dirac
indices contracted along the fermion line. Note also that this interaction term
is specific to the case of an electron (and its antiparticle, the positron). In
general, for a Dirac particle with electric charge $Q|e|$,

[vertex with index $\mu$] $= -iQ|e|\gamma^\mu$.

For example, an electron has $Q = -1$, an up quark has $Q = +2/3$, and a
down quark has $Q = -1/3$.

There is no easy way to derive the form of the photon propagator, so for
now we will settle for a plausibility argument. Since the electromagnetic field
in Lorentz gauge obeys the massless Klein-Gordon equation, it should come
as no surprise that the photon propagator is nearly identical to the massless
Klein-Gordon propagator. The factor of $-g_{\mu\nu}$, however, requires explanation.
Lorentz invariance dictates that the photon propagator be an isotropic
second-rank tensor that can dot together the $\gamma^\mu$ and $\gamma^\nu$ from the vertices at each
end. The simplest candidate is $g^{\mu\nu}$. To understand the overall sign of the
propagator, evaluate its Fourier transform:

$$\int\frac{d^4q}{(2\pi)^4}\,\frac{-ig_{\mu\nu}}{q^2+i\epsilon}\,e^{-iq\cdot(x-y)} = \int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2|\mathbf{q}|}\,e^{-iq\cdot(x-y)}\cdot(-g_{\mu\nu}). \tag{4.132}$$

Presumably this is equal to $\langle 0|\,T[A_\mu(x)A_\nu(y)]\,|0\rangle$. Now set $\mu = \nu$, and take
the limit $x^0 \to y^0$ from the positive direction. Then this quantity becomes the
norm of the state $A_\mu(x)\,|0\rangle$, which should be positive. We see that our choice
of signs in the propagator implies that the three states created by $A_i$,
with i = 1, 2, 3, indeed have positive norm. These states include all real
(nonvirtual) photons, which always have spacelike polarizations. Unfortunately,
because $g_{\mu\nu}$ is not positive definite, the states created by $A_0$ inevitably have
negative norm. This is potentially a serious problem for any theory with vector
particles. For Quantum Electrodynamics, we will show in Section 5.5 that the
negative-norm states created by $A_0$ are never produced in physical processes.
In Section 9.4 we will give a careful derivation of the photon propagator.

The Coulomb Potential

As a simple application of these Feynman rules, and to better understand the
sign of the propagator, let us repeat the nonrelativistic scattering calculation
of the previous section, this time for QED. The leading-order contribution is

$$i\mathcal{M} = \big[\text{diagram}\big] = (-ie)^2\,\bar u(p')\gamma^\mu u(p)\;\frac{-ig_{\mu\nu}}{(p'-p)^2}\;\bar u(k')\gamma^\nu u(k). \tag{4.133}$$

In the nonrelativistic limit,

$$\bar u(p')\gamma^0 u(p) = u^\dagger(p')\,u(p) \approx +2m\,\xi'^\dagger\xi.$$

You can easily verify that the other terms, $\bar u(p')\gamma^i u(p)$, vanish if $\mathbf{p} = \mathbf{p}' = 0$;
they can therefore be neglected compared to $\bar u(p')\gamma^0 u(p)$ in the nonrelativistic
limit. Thus we have

$$i\mathcal{M} \approx \frac{+ie^2}{-|\mathbf{p}'-\mathbf{p}|^2}\,(2m\,\xi'^\dagger\xi)_p\,(2m\,\xi'^\dagger\xi)_k\cdot g_{00} = \frac{-ie^2}{|\mathbf{p}'-\mathbf{p}|^2}\,(2m\,\xi'^\dagger\xi)_p\,(2m\,\xi'^\dagger\xi)_k. \tag{4.134}$$

Comparing this to the Yukawa case (4.123), we see that there is an extra
factor of $-1$; the potential is a repulsive Yukawa potential with m = 0, that
is, a repulsive Coulomb potential:

$$V(r) = \frac{e^2}{4\pi r} = \frac{\alpha}{r}, \tag{4.135}$$

where $\alpha = e^2/4\pi \approx 1/137$ is the fine-structure constant.

For particle-antiparticle scattering, note first that

$$\bar v(k)\gamma^0 v(k') = v^\dagger(k)\,v(k') \approx +2m\,\xi^\dagger\xi'.$$

The presence of the $\gamma^0$ eliminates the minus sign that we found in the Yukawa
case. The nonrelativistic scattering amplitude is therefore

$$i\mathcal{M} = \big[\text{diagram}\big] = (-1)\cdot\frac{-ie^2}{|\mathbf{p}'-\mathbf{p}|^2}\,(2m\,\xi'^\dagger\xi)_p\,(2m\,\xi^\dagger\xi')_k, \tag{4.136}$$


where the $(-1)$ is the same fermion minus sign we saw in the Yukawa case. This
is an attractive potential. Similarly, for antifermion-antifermion scattering one
finds a repulsive potential. We have just verified that in quantum field theory,
when a vector particle is exchanged, like charges repel while unlike charges
attract.

Note that the repulsion in fermion-fermion scattering came entirely from
the extra factor $-g_{00} = -1$ in the vector boson propagator. A tensor boson,
such as the graviton, would have a propagator

$$\big[\text{graviton line}\big] = \frac{1}{2}\Big((-g_{\mu\rho})(-g_{\nu\sigma}) + (-g_{\mu\sigma})(-g_{\nu\rho})\Big)\,\frac{i}{q^2+i\epsilon},$$

which in nonrelativistic collisions gives a factor $(-g_{00})^2 = +1$; this will result
in a universally attractive potential. It is reassuring to see that quantum
field theory does indeed reproduce the obvious features of the electric and
gravitational forces:

Exchanged particle      $ff$ and $\bar f\bar f$      $f\bar f$

scalar (Yukawa)         attractive                   attractive

vector (electricity)    repulsive                    attractive

tensor (gravity)        attractive                   attractive
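The scalar and vector rows of this table follow from multiplying three signs quoted in the text: the propagator factor relative to the Yukawa fermion-fermion case, the nonrelativistic spinor factor for each antifermion line, and the fermion interchange sign. The Python sketch below only tabulates that bookkeeping; the dictionary and function names are illustrative assumptions, and the tensor row is left out because the text does not spell out the corresponding vertex factors.

```python
# Sign of the propagator relative to scalar exchange:
# vector exchange carries the extra factor -g_00 = -1.
PROPAGATOR_FACTOR = {"scalar": +1, "vector": -1}

# Sign from each antifermion line in the nonrelativistic limit:
# vbar v = -2m (scalar vertex), vbar gamma^0 v = +2m (vector vertex).
ANTIFERMION_VERTEX_FACTOR = {"scalar": -1, "vector": +1}

def interchange_sign(n_antifermions):
    """Fermion reordering sign: -1 for fermion-antifermion scattering, +1 otherwise."""
    return -1 if n_antifermions == 1 else +1

def force(exchange, n_antifermions):
    """Classify the potential; a net sign of +1 matches the attractive Yukawa case."""
    sign = (PROPAGATOR_FACTOR[exchange]
            * ANTIFERMION_VERTEX_FACTOR[exchange] ** n_antifermions
            * interchange_sign(n_antifermions))
    return "attractive" if sign > 0 else "repulsive"

if __name__ == "__main__":
    for exchange in ("scalar", "vector"):
        print(exchange, [force(exchange, n) for n in (0, 1, 2)])
```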



Problems 


4.1 Let us return to the problem of the creation of Klein-Gordon particles by a
classical source. Recall from Chapter 2 that this process can be described by the
Hamiltonian

$$H = H_0 + \int d^3x\,\big(-j(t,\mathbf{x})\,\phi(x)\big),$$

where $H_0$ is the free Klein-Gordon Hamiltonian, $\phi(x)$ is the Klein-Gordon field, and
$j(x)$ is a c-number scalar function. We found that, if the system is in the vacuum state
before the source is turned on, the source will create a mean number of particles

$$\langle N\rangle = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf{p}}}\,|\tilde j(p)|^2.$$

In this problem we will verify that statement, and extract more detailed information,
by using a perturbation expansion in the strength of the source.

(a) Show that the probability that the source creates no particles is given by

$$P(0) = \bigg|\,\langle 0|\,T\Big\{\exp\Big[i\int d^4x\;j(x)\,\phi_I(x)\Big]\Big\}\,|0\rangle\,\bigg|^2.$$

(b) Evaluate the term in P(0) of order $j^2$, and show that $P(0) = 1 - \lambda + \mathcal{O}(j^4)$,
where $\lambda$ equals the expression given above for $\langle N\rangle$.

(c) Represent the term computed in part (b) as a Feynman diagram. Now represent
the whole perturbation series for P(0) in terms of Feynman diagrams. Show that
this series exponentiates, so that it can be summed exactly: $P(0) = \exp(-\lambda)$.

(d) Compute the probability that the source creates one particle of momentum k.
Perform this computation first to O(j) and then to all orders, using the trick of
part (c) to sum the series.

(e) Show that the probability of producing n particles is given by

$$P(n) = (1/n!)\,\lambda^n\exp(-\lambda).$$

This is a Poisson distribution.

(f) Prove the following facts about the Poisson distribution:

$$\sum_{n=0}^{\infty}P(n) = 1; \qquad \langle N\rangle = \sum_{n=0}^{\infty}n\,P(n) = \lambda.$$

The first identity says that the P(n)’s are properly normalized probabilities,
while the second confirms our proposal for $\langle N\rangle$. Compute the mean square
fluctuation $\langle(N - \langle N\rangle)^2\rangle$.
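Before attempting the proofs in part (f), the two stated identities can be spot-checked numerically with truncated sums. The Python sketch below is only a sanity check, not a substitute for the proof; the function names, the truncation at `n_max`, and the sample value of $\lambda$ are assumptions.

```python
import math

def poisson(n, lam):
    """P(n) = lam^n exp(-lam) / n!"""
    return lam**n * math.exp(-lam) / math.factorial(n)

def partial_sums(lam, n_max=100):
    """Truncated versions of sum P(n) and sum n P(n)."""
    norm = sum(poisson(n, lam) for n in range(n_max + 1))
    mean = sum(n * poisson(n, lam) for n in range(n_max + 1))
    return norm, mean

if __name__ == "__main__":
    lam = 2.5  # hypothetical mean particle number
    print(partial_sums(lam))
```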

4.2 Decay of a scalar particle. Consider the following Lagrangian, involving two
real scalar fields $\Phi$ and $\phi$:

$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\Phi)^2 - \tfrac{1}{2}M^2\Phi^2 + \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m^2\phi^2 - \mu\,\Phi\phi\phi.$$

The last term is an interaction that allows a $\Phi$ particle to decay into two $\phi$’s, provided
that M > 2m. Assuming that this condition is met, calculate the lifetime of the $\Phi$ to
lowest order in $\mu$.

4.3 Linear sigma model. The interactions of pions at low energy can be described 
by a phenomenological model called the linear sigma model. Essentially, this model 
consists of N real scalar fields coupled by a <p A interaction that is symmetric under 
rotations of the N fields. More specifically, let 4> ! (x), 1 = 1,..., N be a set of N fields, 
governed by the Hamiltonian 

h = j d 3 x (£(n*) 2 + |(v^) 2 + v($ 2 )), 

where (<3>*) 2 = # • $, and 

P($ 2 ) = im 2 ($ ,: ) 2 + ^(($ ,: ) 2 ) 2 

is a function symmetric under rotations of #. For (classical) field configurations of 
$*(.t) that are constant in space and time, this term gives the only contribution to H; 
hence, V is the field potential energy. 

(What does this Hamiltonian have to do with the strong interactions? There
are two types of light quarks, $u$ and $d$. These quarks have identical strong
interactions, but different masses. If these quarks are massless, the Hamiltonian of the
strong interactions is invariant to unitary transformations of the 2-component object $(u, d)$:

$$\begin{pmatrix} u \\ d \end{pmatrix} \to \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\sigma}/2) \begin{pmatrix} u \\ d \end{pmatrix}.$$

This transformation is called an isospin rotation. If, in addition, the strong interactions
are described by a vector “gluon” field (as is true in QCD), the strong interaction
Hamiltonian is invariant to the isospin rotations done separately on the left-handed
and right-handed components of the quark fields. Thus, the complete symmetry of
QCD with two massless quarks is $SU(2) \times SU(2)$. It happens that $SO(4)$, the group
of rotations in 4 dimensions, is isomorphic to $SU(2) \times SU(2)$, so for $N = 4$, the linear
sigma model has the same symmetry group as the strong interactions.)

(a) Analyze the linear sigma model for $m^2 > 0$ by noticing that, for $\lambda = 0$, the
Hamiltonian given above is exactly $N$ copies of the Klein-Gordon Hamiltonian.
We can then calculate scattering amplitudes as perturbation series in the
parameter $\lambda$. Show that the propagator is

$$\overbracket{\Phi^i(x)\,\Phi^j(y)} = \delta^{ij} D_F(x - y),$$

where $D_F$ is the standard Klein-Gordon propagator for mass $m$, and that there
is one type of vertex given by

$$(\text{vertex joining } \Phi^i, \Phi^j, \Phi^k, \Phi^l) = -2i\lambda\,(\delta^{ij}\delta^{kl} + \delta^{il}\delta^{jk} + \delta^{ik}\delta^{jl}).$$

(That is, the vertex between two $\Phi^1$s and two $\Phi^2$s has the value $(-2i\lambda)$; that
between four $\Phi^1$s has the value $(-6i\lambda)$.) Compute, to leading order in $\lambda$, the
differential cross sections $d\sigma/d\Omega$, in the center-of-mass frame, for the scattering
processes

$$\Phi^1\Phi^2 \to \Phi^1\Phi^2, \qquad \Phi^1\Phi^1 \to \Phi^2\Phi^2, \qquad \text{and} \qquad \Phi^1\Phi^1 \to \Phi^1\Phi^1$$

as functions of the center-of-mass energy.
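As a quick sanity check on the index structure of this vertex, the factor can be tabulated directly. The sketch below is illustrative only: the function name `vertex_factor` and the numerical value chosen for the coupling are assumptions, not part of the problem statement. It reproduces the two special cases quoted in the parenthetical remark.

```python
def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def vertex_factor(i, j, k, l, lam):
    """Four-point vertex -2i*lam*(d_ij d_kl + d_il d_jk + d_ik d_jl)."""
    structure = (
        delta(i, j) * delta(k, l)
        + delta(i, l) * delta(j, k)
        + delta(i, k) * delta(j, l)
    )
    return -2j * lam * structure


lam = 0.1  # illustrative coupling value (an assumption)
print(vertex_factor(1, 1, 2, 2, lam))  # two Phi^1 legs and two Phi^2 legs
print(vertex_factor(1, 1, 1, 1, lam))  # four Phi^1 legs
```

Only one of the three Kronecker-delta products survives for the mixed case, while all three contribute when the four indices coincide, which is the origin of the relative factor of 3.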

(b) Now consider the case $m^2 < 0$: $m^2 = -\mu^2$. In this case, $V$ has a local maximum,
rather than a minimum, at $\Phi^i = 0$. Since $V$ is a potential energy, this implies
that the ground state of the theory is not near $\Phi^i = 0$ but rather is obtained by
shifting $\Phi^i$ toward the minimum of $V$. By rotational invariance, we can consider
this shift to be in the $N$th direction. Write, then,

$$\Phi^i(x) = \pi^i(x), \quad i = 1, \ldots, N-1,$$

$$\Phi^N(x) = v + \sigma(x),$$

where $v$ is a constant chosen to minimize $V$. (The notation $\pi^i$ suggests a pion
field and should not be confused with a canonical momentum.) Show that, in
these new coordinates (and substituting for $v$ its expression in terms of $\lambda$ and $\mu$),
we have a theory of a massive $\sigma$ field and $N - 1$ massless pion fields, interacting
through cubic and quartic potential energy terms which all become small as
$\lambda \to 0$. Construct the Feynman rules by assigning values to the propagators and
vertices:


(c) Compute the scattering amplitude for the process

$$\pi^i(p_1)\,\pi^j(p_2) \to \pi^k(p_3)\,\pi^l(p_4)$$

to leading order in $\lambda$. There are now four Feynman diagrams that contribute:


Show that, at threshold ($\mathbf{p}_i = 0$), these diagrams sum to zero. (Hint: It may be
easiest to first consider the specific process $\pi^1\pi^1 \to \pi^2\pi^2$, for which only the first
and fourth diagrams are nonzero, before tackling the general case.) Show that,
in the special case $N = 2$ (1 species of pion), the term of $\mathcal{O}(p^2)$ also cancels.

(d) Add to $V$ a symmetry-breaking term,

$$\Delta V = -a\,\Phi^N,$$

where $a$ is a (small) constant. (In QCD, a term of this form is produced if the $u$
and $d$ quarks have the same nonvanishing mass.) Find the new value of $v$ that
minimizes $V$, and work out the content of the theory about that point. Show that
the pion acquires a mass such that $m_\pi^2 \sim a$, and show that the pion scattering
amplitude at threshold is now nonvanishing and also proportional to $a$.

4.4 Rutherford scattering. The cross section for scattering of an electron by the
Coulomb field of a nucleus can be computed, to lowest order, without quantizing the
electromagnetic field. Instead, treat the field as a given, classical potential $A_\mu(x)$. The
interaction Hamiltonian is

$$H_I = \int d^3x\; e\,\bar\psi\gamma^\mu\psi\,A_\mu,$$

where $\psi(x)$ is the usual quantized Dirac field.

(a) Show that the $T$-matrix element for electron scattering off a localized classical
potential is, to lowest order,

$$\langle p'|iT|p\rangle = -ie\,\bar u(p')\gamma^\mu u(p)\cdot\tilde A_\mu(p' - p),$$

where $\tilde A_\mu(q)$ is the four-dimensional Fourier transform of $A_\mu(x)$.

(b) If $A_\mu(x)$ is time independent, its Fourier transform contains a delta function of
energy. It is then natural to define

$$\langle p'|iT|p\rangle \equiv i\mathcal{M}\cdot(2\pi)\delta(E_f - E_i),$$

where $E_i$ and $E_f$ are the initial and final energies of the particle, and to adopt
a new Feynman rule for computing $\mathcal{M}$:

$$(\text{vertex with the external field}) = -ie\gamma^\mu\tilde A_\mu(\mathbf{q}),$$

where $\tilde A_\mu(\mathbf{q})$ is the three-dimensional Fourier transform of $A_\mu(x)$. Given this
definition of $\mathcal{M}$, show that the cross section for scattering off a time-independent,
localized potential is

$$d\sigma = \frac{1}{v_i}\,\frac{1}{2E_i}\,\frac{d^3p_f}{(2\pi)^3}\,\frac{1}{2E_f}\,|\mathcal{M}(p_i \to p_f)|^2\,(2\pi)\delta(E_f - E_i),$$

where $v_i$ is the particle’s initial velocity. This formula is a natural modification
of (4.79). Integrate over $|p_f|$ to find a simple expression for $d\sigma/d\Omega$.

(c) Specialize to the case of electron scattering from a Coulomb potential ($A^0 =
Ze/4\pi r$). Working in the nonrelativistic limit, derive the Rutherford formula,

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2 Z^2}{4m^2v^4\sin^4(\theta/2)}.$$

(With a few calculational tricks from Section 5.1, you will have no difficulty
evaluating the general cross section in the relativistic case; see Problem 5.1.)



Chapter 5 


Elementary Processes of 
Quantum Electrodynamics 


Finally, after three long chapters of formalism, we are ready to perform some 
real relativistic calculations, to begin working out the predictions of Quantum 
Electrodynamics. First we will return to the process considered in Chapter 1, 
the annihilation of an electron-positron pair into a pair of heavier fermions. 
We will study this paradigm process in extreme detail in the next three sections,
then do a few more simple QED calculations in Sections 5.4 and 5.5.
The problems at the end of the chapter treat several additional QED processes.
More complete surveys of QED can be found in the books of Jauch
and Rohrlich (1976) and of Berestetskii, Lifshitz, and Pitaevskii (1982). 

5.1 $e^+e^- \to \mu^+\mu^-$: Introduction

The reaction $e^+e^- \to \mu^+\mu^-$ is the simplest of all QED processes, but also
one of the most important in high-energy physics. It is fundamental to the
understanding of all reactions in $e^+e^-$ colliders, and is in fact used to calibrate
such machines. The related process $e^+e^- \to q\bar q$ (a quark-antiquark pair) is
extraordinarily useful in determining the properties of elementary particles.

In this section we will compute the unpolarized cross section for $e^+e^- \to \mu^+\mu^-$,
to lowest order. In Chapter 1 we used elementary arguments to guess
the answer (Eq. (1.8)) in the limit where all the fermions are massless. We
now relax that restriction and retain the muon mass in the calculation.
Retaining the electron mass as well would be easy but pointless, since the ratio
$m_e/m_\mu \approx 1/200$ is much smaller than the fractional error introduced by
neglecting higher-order terms in the perturbation series.

Using the Feynman rules from Section 4.8, we can at once draw the diagram
and write down the amplitude for our process:

$$i\mathcal{M} = \bar v^{s'}(p')\,(-ie\gamma^\mu)\,u^s(p)\left(\frac{-ig_{\mu\nu}}{q^2}\right)\bar u^r(k)\,(-ie\gamma^\nu)\,v^{r'}(k').$$

Rearranging this slightly and leaving the spin superscripts implicit, we have

$$i\mathcal{M}\bigl(e^-(p)e^+(p') \to \mu^-(k)\mu^+(k')\bigr) = \frac{ie^2}{q^2}\bigl(\bar v(p')\gamma^\mu u(p)\bigr)\bigl(\bar u(k)\gamma_\mu v(k')\bigr). \tag{5.1}$$

This answer for the amplitude $\mathcal{M}$ is simple, but not yet very illuminating.

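To compute the differential cross section, we need an expression for $|\mathcal{M}|^2$,
so we must find the complex conjugate of $\mathcal{M}$. A bi-spinor product such as
$\bar v\gamma^\mu u$ can be complex-conjugated as follows:

$$(\bar v\gamma^\mu u)^* = u^\dagger(\gamma^\mu)^\dagger(\gamma^0)^\dagger v = u^\dagger(\gamma^\mu)^\dagger\gamma^0 v = u^\dagger\gamma^0\gamma^\mu v = \bar u\gamma^\mu v.$$

(This is another advantage of the ‘bar’ notation.) Thus the squared matrix
element is

$$|\mathcal{M}|^2 = \frac{e^4}{q^4}\bigl(\bar v(p')\gamma^\mu u(p)\,\bar u(p)\gamma^\nu v(p')\bigr)\bigl(\bar u(k)\gamma_\mu v(k')\,\bar v(k')\gamma_\nu u(k)\bigr). \tag{5.2}$$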

At this point we are still free to specify any particular spinors $u^s(p)$,
$\bar v^{s'}(p')$, and so on, corresponding to any desired spin states of the fermions.
In actual experiments, however, it is difficult (though not impossible) to
retain control over spin states; one would have to prepare the initial state from
polarized materials and/or analyze the final state using spin-dependent
multiple scattering. In most experiments the electron and positron beams are
unpolarized, so the measured cross section is an average over the electron and
positron spins $s$ and $s'$. Muon detectors are normally blind to polarization, so
the measured cross section is a sum over the muon spins $r$ and $r'$.

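The expression for $|\mathcal{M}|^2$ simplifies considerably when we throw away the
spin information. We want to compute

$$\frac{1}{2}\sum_s \frac{1}{2}\sum_{s'} \sum_r \sum_{r'} |\mathcal{M}(s, s' \to r, r')|^2.$$

The spin sums can be performed using the completeness relations from
Section 3.3:

$$\sum_s u^s(p)\bar u^s(p) = \slashed{p} + m; \qquad \sum_s v^s(p)\bar v^s(p) = \slashed{p} - m. \tag{5.3}$$

Working with the first half of (5.2), and writing in spinor indices so we can
freely move the $v$ next to the $\bar v$, we have

$$\begin{aligned}
\sum_{s,s'} \bar v^{s'}_a(p')\gamma^\mu_{ab}u^s_b(p)\,\bar u^s_c(p)\gamma^\nu_{cd}v^{s'}_d(p') &= (\slashed{p}' - m)_{da}\gamma^\mu_{ab}(\slashed{p} + m)_{bc}\gamma^\nu_{cd} \\
&= \operatorname{trace}\bigl[(\slashed{p}' - m)\gamma^\mu(\slashed{p} + m)\gamma^\nu\bigr].
\end{aligned}$$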

Evaluating the second half of (5.2) in the same way, we arrive at the desired
simplification:

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{e^4}{4q^4}\operatorname{tr}\bigl[(\slashed{p}' - m_e)\gamma^\mu(\slashed{p} + m_e)\gamma^\nu\bigr]\operatorname{tr}\bigl[(\slashed{k} + m_\mu)\gamma_\mu(\slashed{k}' - m_\mu)\gamma_\nu\bigr]. \tag{5.4}$$

The spinors $u$ and $v$ have disappeared, leaving us with a much cleaner expression
in terms of $\gamma$ matrices. This trick is very general: Any QED amplitude
involving external fermions, when squared and summed or averaged over spins,
can be converted in this way to traces of products of Dirac matrices.


Trace Technology 

This last step would hardly be an improvement if the traces had to be
laboriously computed by brute force. But Feynman found that they could be
worked out easily by appealing to the algebraic properties of the $\gamma$ matrices.
Since the evaluation of such traces occurs so often in QED calculations, it is
worthwhile to pause and attack the problem systematically, once and for all.

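We would like to evaluate traces of products of $n$ gamma matrices, where
$n = 0, 1, 2, \ldots$. (For the present problem we need $n = 2, 3, 4$.) The $n = 0$
case is fairly easy: $\operatorname{tr} 1 = 4$. The trace of one $\gamma$ matrix is also easy. From the
explicit form of the matrices in the chiral representation, we have

$$\operatorname{tr}\gamma^\mu = \operatorname{tr}\begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} = 0.$$

It is useful to prove this result in a more abstract way, which generalizes to
an arbitrary odd number of $\gamma$ matrices:

$$\begin{aligned}
\operatorname{tr}\gamma^\mu &= \operatorname{tr}\gamma^5\gamma^5\gamma^\mu && \text{since } (\gamma^5)^2 = 1 \\
&= -\operatorname{tr}\gamma^5\gamma^\mu\gamma^5 && \text{since } \{\gamma^\mu, \gamma^5\} = 0 \\
&= -\operatorname{tr}\gamma^5\gamma^5\gamma^\mu && \text{using cyclic property of trace} \\
&= -\operatorname{tr}\gamma^\mu.
\end{aligned}$$

Since the trace of $\gamma^\mu$ is equal to minus itself, it must vanish. For $n$ $\gamma$-matrices
we would get $n$ minus signs in the second step (as we move the second $\gamma^5$ all
the way to the right), so the trace must vanish if $n$ is odd.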

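To evaluate the trace of two $\gamma$ matrices, we again use the anticommutation
properties and the cyclic property of the trace:

$$\begin{aligned}
\operatorname{tr}\gamma^\mu\gamma^\nu &= \operatorname{tr}(2g^{\mu\nu}\cdot 1 - \gamma^\nu\gamma^\mu) && \text{(anticommutation)} \\
&= 8g^{\mu\nu} - \operatorname{tr}\gamma^\mu\gamma^\nu && \text{(cyclicity)}
\end{aligned}$$

Thus $\operatorname{tr}\gamma^\mu\gamma^\nu = 4g^{\mu\nu}$. The trace of any even number of $\gamma$ matrices can be
evaluated in the same way: Anticommute the first $\gamma$ matrix all the way to the
right, then cycle it back to the left. Thus for the trace of four $\gamma$ matrices, we
have

$$\begin{aligned}
\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= \operatorname{tr}(2g^{\mu\nu}\gamma^\rho\gamma^\sigma - \gamma^\nu\gamma^\mu\gamma^\rho\gamma^\sigma) \\
&= \operatorname{tr}(2g^{\mu\nu}\gamma^\rho\gamma^\sigma - \gamma^\nu 2g^{\mu\rho}\gamma^\sigma + \gamma^\nu\gamma^\rho 2g^{\mu\sigma} - \gamma^\nu\gamma^\rho\gamma^\sigma\gamma^\mu).
\end{aligned}$$

Using the cyclic property on the last term and bringing it to the left-hand
side, we find

$$\begin{aligned}
\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= g^{\mu\nu}\operatorname{tr}\gamma^\rho\gamma^\sigma - g^{\mu\rho}\operatorname{tr}\gamma^\nu\gamma^\sigma + g^{\mu\sigma}\operatorname{tr}\gamma^\nu\gamma^\rho \\
&= 4\,(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}).
\end{aligned}$$

In this manner one can always reduce a trace of $n$ $\gamma$-matrices to a sum of
traces of $(n - 2)$ $\gamma$-matrices. The case $n = 6$ is easy to work out, but has
fifteen terms (the number of ways of grouping the six indices in pairs to make
terms of the form $g^{\mu\nu}g^{\rho\sigma}g^{\alpha\beta}$). Fortunately, we will not need it in this book.
(If you ever do need to evaluate such complicated traces, it may be easier to
learn to use one of the several computer programs that can perform symbolic
manipulations on Dirac matrices.)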

Starting in Section 5.2, we will often need to evaluate traces involving $\gamma^5$.
Since $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, the trace of $\gamma^5$ times any odd number of other $\gamma$
matrices is zero. It is also easy to show that the trace of $\gamma^5$ itself is zero:

$$\operatorname{tr}\gamma^5 = \operatorname{tr}(\gamma^0\gamma^0\gamma^5) = -\operatorname{tr}(\gamma^0\gamma^5\gamma^0) = -\operatorname{tr}(\gamma^0\gamma^0\gamma^5) = -\operatorname{tr}\gamma^5.$$

The same trick works for $\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^5)$, if we insert two factors of $\gamma^\alpha$ for some $\alpha$
different from both $\mu$ and $\nu$. The first nonvanishing trace involving $\gamma^5$ contains
four other $\gamma$ matrices. In this case the trick still works unless every $\gamma$ matrix
appears, so $\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^5) = 0$ unless $(\mu\nu\rho\sigma)$ is some permutation of $(0123)$.
From the anticommutation rules it also follows that interchanging any two of
the indices simply changes the sign of the trace, so $\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^5)$ must be
proportional to $\epsilon^{\mu\nu\rho\sigma}$. The overall constant turns out to be $-4i$, as you can
easily check by plugging in $(\mu\nu\rho\sigma) = (0123)$.
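The trace identities derived so far can also be spot-checked by brute force, which is a useful guard against sign slips. The following is a minimal sketch, assuming the chiral representation used in the text ($\gamma^0$ with unit off-diagonal blocks, $\gamma^i$ with blocks $\sigma^i$ and $-\sigma^i$, and $\gamma^5 = \mathrm{diag}(-1, -1, 1, 1)$). The helper names (`matmul`, `blocks`, `trace_of`, `check_trace_theorems`) are invented for this illustration, and only the Python standard library is used.

```python
from itertools import product


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def blocks(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [
        top_left[0] + top_right[0],
        top_left[1] + top_right[1],
        bottom_left[0] + bottom_right[0],
        bottom_left[1] + bottom_right[1],
    ]


def neg(a):
    return [[-x for x in row] for row in a]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Chiral representation: gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, s_i], [-s_i, 0]].
GAMMA = [blocks(ZERO, ONE, ONE, ZERO)] + [blocks(ZERO, s, neg(s), ZERO) for s in PAULI]
GAMMA5 = blocks(neg(ONE), ZERO, ZERO, ONE)


def g(mu, nu):
    """Metric g^{mu nu} = diag(1, -1, -1, -1)."""
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1


def trace_of(*mats):
    """Trace of the ordered product of the given 4x4 matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return sum(out[i][i] for i in range(4))


def check_trace_theorems(tol=1e-12):
    idx = range(4)
    assert abs(trace_of(GAMMA5)) < tol
    for mu in idx:
        assert abs(trace_of(GAMMA[mu])) < tol
    for mu, nu in product(idx, repeat=2):
        assert abs(trace_of(GAMMA[mu], GAMMA[nu]) - 4 * g(mu, nu)) < tol
        assert abs(trace_of(GAMMA[mu], GAMMA[nu], GAMMA5)) < tol
    for mu, nu, rho in product(idx, repeat=3):
        assert abs(trace_of(GAMMA[mu], GAMMA[nu], GAMMA[rho])) < tol
    for mu, nu, rho, sig in product(idx, repeat=4):
        expected = 4 * (g(mu, nu) * g(rho, sig) - g(mu, rho) * g(nu, sig) + g(mu, sig) * g(nu, rho))
        assert abs(trace_of(GAMMA[mu], GAMMA[nu], GAMMA[rho], GAMMA[sig]) - expected) < tol
    return True


check_trace_theorems()
# The (0123) component of the gamma^5 trace, and its sign flip under an index swap.
print(trace_of(GAMMA[0], GAMMA[1], GAMMA[2], GAMMA[3], GAMMA5))
print(trace_of(GAMMA[1], GAMMA[0], GAMMA[2], GAMMA[3], GAMMA5))
```

The traces are representation independent, so any other explicit representation of the Dirac algebra could be substituted for the matrices defined here.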

Here is a summary of the trace theorems, for convenient reference: 

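$$\begin{aligned}
\operatorname{tr}(1) &= 4 \\
\operatorname{tr}(\text{any odd \# of } \gamma\text{'s}) &= 0 \\
\operatorname{tr}(\gamma^\mu\gamma^\nu) &= 4g^{\mu\nu} \\
\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= 4\,(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}) \\
\operatorname{tr}(\gamma^5) &= 0 \\
\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^5) &= 0 \\
\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^5) &= -4i\epsilon^{\mu\nu\rho\sigma}
\end{aligned} \tag{5.5}$$

Expressions resulting from use of the last formula can be simplified by means
of the identities

$$\begin{aligned}
\epsilon^{\alpha\beta\gamma\delta}\epsilon_{\alpha\beta\gamma\delta} &= -24 \\
\epsilon^{\alpha\beta\gamma\mu}\epsilon_{\alpha\beta\gamma\nu} &= -6\,\delta^\mu_\nu \\
\epsilon^{\alpha\beta\mu\nu}\epsilon_{\alpha\beta\rho\sigma} &= -2\,(\delta^\mu_\rho\delta^\nu_\sigma - \delta^\mu_\sigma\delta^\nu_\rho)
\end{aligned} \tag{5.6}$$

All of these can be derived by first appealing to symmetry arguments, then
evaluating one special case to determine the overall constant.

Another useful identity allows one to reverse the order of all the $\gamma$ matrices
inside a trace:

$$\operatorname{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\cdots) = \operatorname{tr}(\cdots\gamma^\sigma\gamma^\rho\gamma^\nu\gamma^\mu). \tag{5.7}$$

To prove this relation, consider the matrix $C \equiv \gamma^0\gamma^2$ (essentially the
charge-conjugation operator). This matrix satisfies $C^2 = 1$ and $C\gamma^\mu C = -(\gamma^\mu)^T$.
Thus if there are $n$ $\gamma$-matrices inside the trace,

$$\begin{aligned}
\operatorname{tr}(\gamma^\mu\gamma^\nu\cdots) &= \operatorname{tr}(C\gamma^\mu C\,C\gamma^\nu C\cdots) \\
&= (-1)^n\operatorname{tr}\bigl[(\gamma^\mu)^T(\gamma^\nu)^T\cdots\bigr] \\
&= \operatorname{tr}(\cdots\gamma^\nu\gamma^\mu),
\end{aligned}$$

since the trace vanishes unless $n$ is even. It is easy to show that the reversal
identity (5.7) is also valid when the trace contains one or more factors of $\gamma^5$.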

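When two $\gamma$ matrices inside a trace are dotted together, it is easiest to
eliminate them before evaluating the trace. For example,

$$\gamma^\mu\gamma_\mu = g_{\mu\nu}\gamma^\mu\gamma^\nu = g_{\mu\nu}g^{\mu\nu} = 4. \tag{5.8}$$

The following contraction identities, all easy to prove using the anticommutation
relations, can be used when other $\gamma$ matrices lie in between:

$$\begin{aligned}
\gamma^\mu\gamma^\nu\gamma_\mu &= -2\gamma^\nu \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu &= 4g^{\nu\rho} \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu &= -2\gamma^\sigma\gamma^\rho\gamma^\nu
\end{aligned} \tag{5.9}$$

Note the reversal of order in the last identity.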
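To illustrate how the anticommutation relations produce these identities, here is the first one worked out explicitly. It is a short sketch that uses only $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ and the contraction $\gamma^\mu\gamma_\mu = 4$ of (5.8):

```latex
\begin{aligned}
\gamma^\mu\gamma^\nu\gamma_\mu
  &= (2g^{\mu\nu} - \gamma^\nu\gamma^\mu)\gamma_\mu \\
  &= 2\gamma^\nu - \gamma^\nu(\gamma^\mu\gamma_\mu) \\
  &= 2\gamma^\nu - 4\gamma^\nu = -2\gamma^\nu .
\end{aligned}
```

The longer identities follow the same pattern: anticommute $\gamma^\mu$ to the right one step at a time until it meets $\gamma_\mu$, then apply the shorter identities already established.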

All of the $\gamma$ matrix identities proved in this section are collected for
reference in the Appendix.

Unpolarized Cross Section 

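We now return to the evaluation of the squared matrix element, Eq. (5.4).
The electron trace is

$$\operatorname{tr}\bigl[(\slashed{p}' - m_e)\gamma^\mu(\slashed{p} + m_e)\gamma^\nu\bigr] = 4\bigl[p'^\mu p^\nu + p'^\nu p^\mu - g^{\mu\nu}(p\cdot p' + m_e^2)\bigr].$$

The terms with only one factor of $m$ vanish, since they contain an odd number
of $\gamma$ matrices. Similarly, the muon trace is

$$\operatorname{tr}\bigl[(\slashed{k} + m_\mu)\gamma_\mu(\slashed{k}' - m_\mu)\gamma_\nu\bigr] = 4\bigl[k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}(k\cdot k' + m_\mu^2)\bigr].$$

From now on we will set $m_e = 0$, as discussed at the beginning of this section.
Dotting these expressions together and collecting terms, we get the simple
result

$$\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{8e^4}{q^4}\bigl[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,(p\cdot p')\bigr]. \tag{5.10}$$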
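For readers who want to see the intermediate step, the contraction of the two traces (with $m_e = 0$) can be organized as follows. This is only a sketch of the bookkeeping; the one extra ingredient is $g^{\mu\nu}g_{\mu\nu} = 4$:

```latex
\begin{aligned}
&\bigl[p'^\mu p^\nu + p'^\nu p^\mu - g^{\mu\nu}\,p\cdot p'\bigr]
 \bigl[k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}(k\cdot k' + m_\mu^2)\bigr] \\
&\quad = 2(p\cdot k)(p'\cdot k') + 2(p\cdot k')(p'\cdot k)
   - 2(p\cdot p')(k\cdot k' + m_\mu^2) - 2(p\cdot p')(k\cdot k')
   + 4(p\cdot p')(k\cdot k' + m_\mu^2) \\
&\quad = 2\bigl[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) + m_\mu^2\,(p\cdot p')\bigr].
\end{aligned}
```

Multiplying by the factor $4 \cdot 4 = 16$ from the two traces and by the prefactor $e^4/4q^4$ of (5.4) gives the overall coefficient $8e^4/q^4$.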

To obtain a more explicit formula we must specialize to a particular frame
of reference and express the vectors $p$, $p'$, $k$, $k'$, and $q$ in terms of the basic
kinematic variables (energies and angles) in that frame. In practice, the choice
of frame will be dictated by the experimental conditions. In this book, we will
usually make the simplest choice of evaluating cross sections in the center-of-mass
frame. For this choice, the initial and final 4-momenta for $e^+e^- \to \mu^+\mu^-$
can be written as follows:

$$p = (E, E\hat z), \quad p' = (E, -E\hat z), \quad k = (E, \mathbf{k}), \quad k' = (E, -\mathbf{k}), \qquad |\mathbf{k}| = \sqrt{E^2 - m_\mu^2}, \quad \mathbf{k}\cdot\hat z = |\mathbf{k}|\cos\theta.$$

To compute the squared matrix element we need

$$q^2 = (p + p')^2 = 4E^2; \qquad p\cdot p' = 2E^2;$$

$$p\cdot k = p'\cdot k' = E^2 - E|\mathbf{k}|\cos\theta; \qquad p\cdot k' = p'\cdot k = E^2 + E|\mathbf{k}|\cos\theta.$$
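We can now rewrite Eq. (5.10) in terms of $E$ and $\theta$:

$$\begin{aligned}
\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 &= \frac{8e^4}{16E^4}\Bigl[E^2(E - |\mathbf{k}|\cos\theta)^2 + E^2(E + |\mathbf{k}|\cos\theta)^2 + 2m_\mu^2E^2\Bigr] \\
&= e^4\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2\theta\right].
\end{aligned} \tag{5.11}$$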

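All that remains is to plug this expression into the cross-section formula
derived in Section 4.5. Since there are only two particles in the final state and
we are working in the center-of-mass frame, we can use the simplified formula
(4.84). For our problem $|v_A - v_B| = 2$ and $E_A = E_B = E_{\rm cm}/2$, so we have

$$\begin{aligned}
\frac{d\sigma}{d\Omega} &= \frac{1}{2E_{\rm cm}^2}\,\frac{|\mathbf{k}|}{16\pi^2E_{\rm cm}}\cdot\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 \\
&= \frac{\alpha^2}{4E_{\rm cm}^2}\sqrt{1 - \frac{m_\mu^2}{E^2}}\left[\left(1 + \frac{m_\mu^2}{E^2}\right) + \left(1 - \frac{m_\mu^2}{E^2}\right)\cos^2\theta\right].
\end{aligned} \tag{5.12}$$

Integrating over $d\Omega$, we find the total cross section:

$$\sigma_{\rm total} = \frac{4\pi\alpha^2}{3E_{\rm cm}^2}\sqrt{1 - \frac{m_\mu^2}{E^2}}\left(1 + \frac{1}{2}\frac{m_\mu^2}{E^2}\right). \tag{5.13}$$

Figure 5.1. Energy dependence of the total cross section for $e^+e^- \to \mu^+\mu^-$,
compared to “phase space” energy dependence.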


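In the high-energy limit where $E \gg m_\mu$, these formulae reduce to those given
in Chapter 1:

$$\frac{d\sigma}{d\Omega} \underset{E \gg m_\mu}{\longrightarrow} \frac{\alpha^2}{4E_{\rm cm}^2}\,(1 + \cos^2\theta); \qquad \sigma_{\rm total} \underset{E \gg m_\mu}{\longrightarrow} \frac{4\pi\alpha^2}{3E_{\rm cm}^2}\left(1 - \frac{3}{8}\Bigl(\frac{m_\mu}{E}\Bigr)^4 - \cdots\right). \tag{5.14}$$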


Note that these expressions have the correct dimensions of cross sections.
In the high-energy limit, $E_{\rm cm}$ is the only dimensionful quantity in the problem,
so dimensional analysis dictates that $\sigma_{\rm total} \propto E_{\rm cm}^{-2}$. Since we knew from the
beginning that $\sigma_{\rm total} \propto \alpha^2$, we only had to work to get the factor of $4\pi/3$.
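Formulas (5.12) and (5.13) are easy to explore numerically. The sketch below evaluates both in natural units (GeV$^{-2}$), integrates (5.12) over solid angle with a simple midpoint rule to confirm (5.13), and compares the result with the pure phase-space shape. The numerical values of $\alpha$ and $m_\mu$ are assumed inputs, and the function names are invented for this illustration.

```python
import math

ALPHA = 1 / 137.036  # fine-structure constant (assumed input value)
M_MU = 0.10566       # muon mass in GeV (assumed input value)


def dsigma_domega(e_cm, cos_theta, m=M_MU, alpha=ALPHA):
    """Eq. (5.12) in GeV^-2; returns zero below threshold."""
    energy = e_cm / 2                 # beam energy E
    x = (m / energy) ** 2             # m_mu^2 / E^2
    if x >= 1:
        return 0.0
    angular = (1 + x) + (1 - x) * cos_theta**2
    return alpha**2 / (4 * e_cm**2) * math.sqrt(1 - x) * angular


def sigma_total(e_cm, m=M_MU, alpha=ALPHA):
    """Eq. (5.13) in GeV^-2; returns zero below threshold."""
    x = (2 * m / e_cm) ** 2
    if x >= 1:
        return 0.0
    return 4 * math.pi * alpha**2 / (3 * e_cm**2) * math.sqrt(1 - x) * (1 + x / 2)


def integrate_over_angles(e_cm, n=2000):
    """Midpoint-rule integral of (5.12) over the full solid angle."""
    h = 2.0 / n
    total = sum(dsigma_domega(e_cm, -1 + (i + 0.5) * h) for i in range(n))
    return 2 * math.pi * h * total


def phase_space_only(e_cm, m=M_MU, alpha=ALPHA):
    """Asymptotic cross section times the phase-space factor |k|/E alone."""
    x = (2 * m / e_cm) ** 2
    if x >= 1:
        return 0.0
    return 4 * math.pi * alpha**2 / (3 * e_cm**2) * math.sqrt(1 - x)


for e_cm in (0.22, 0.3, 1.0, 10.0):
    print(e_cm, sigma_total(e_cm), integrate_over_angles(e_cm), phase_space_only(e_cm))
```

Just above threshold the factor $(1 + m_\mu^2/2E^2)$ makes the true cross section rise up to 50% faster than the phase-space curve, which is the deviation an experiment must resolve.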

The energy dependence of the total cross-section formula (5.13) near
threshold is shown in Fig. 5.1. Of course the cross section is zero for $E_{\rm cm} < 2m_\mu$.
It is interesting to compare the shape of the actual curve to the shape
one would obtain if $|\mathcal{M}|^2$ did not depend on energy, that is, if all the energy
dependence came from the phase-space factor $|\mathbf{k}|/E$. To test Quantum
Electrodynamics, an experiment must be able to resolve deviations from the naive
phase-space prediction. Experimental results from pair production of both
$\mu$ and $\tau$ leptons confirm that these particles behave as QED predicts.
Figure 5.2 compares formula (5.13) to experimental measurements of the $\tau^+\tau^-$
threshold.

Figure 5.2. The ratio $\sigma(e^+e^- \to \tau^+\tau^-)/\sigma(e^+e^- \to \mu^+\mu^-)$ of measured
cross sections near the threshold for $\tau^+\tau^-$ pair-production, as measured
by the DELCO collaboration, W. Bacino, et al., Phys. Rev. Lett. 41, 13
(1978). Only a fraction of $\tau$ decays are included, hence the small overall
scale. The curve shows a fit to the theoretical formula (5.13), with a small
energy-independent background added. The fit yields $m_\tau = 1782^{+2}_{-7}$ MeV.

Before discussing our result further, let us pause to summarize how we
obtained it. The method extends in a straightforward way to the calculation
of unpolarized cross sections for other QED processes. The general procedure
is as follows:

1. Draw the diagram(s) for the desired process. 

2. Use the Feynman rules to write down the amplitude M. 

3. Square the amplitude and average or sum over spins, using the completeness
relations (5.3). (For processes involving photons in the final state
there is an analogous completeness relation, derived in Section 5.5.)

4. Evaluate traces using the trace theorems (5.5); collect terms and simplify 
the answer as much as possible. 

5. Specialize to a particular frame of reference, and draw a picture of the
kinematic variables in that frame. Express all 4-momentum vectors in
terms of a suitably chosen set of variables such as $E$ and $\theta$.

6. Plug the resulting expression for $|\mathcal{M}|^2$ into the cross-section formula
(4.79), and integrate over phase-space variables that are not measured
to obtain a differential cross section in the desired form. (In our case
these integrations were over the constrained momenta $\mathbf{k}'$ and $|\mathbf{k}|$, and
were performed in the derivation of Eq. (4.84).)

While other calculations (especially those involving loop diagrams) often
require additional tricks, nearly every QED calculation will involve the basic
procedures outlined here.

Production of Quark-Antiquark Pairs

The asymptotic energy dependence of the $e^+e^- \to \mu^+\mu^-$ cross-section formula
sets the scale for all $e^+e^-$ annihilation cross sections. A particularly important
example is the cross section for

$$e^+e^- \to \text{hadrons},$$

that is, the total cross section for production of any number of strongly
interacting particles.

In our current understanding of the strong interactions, given by the theory
called Quantum Chromodynamics (QCD), all hadrons are composed of
Dirac fermions called quarks. Quarks appear in a variety of types, called flavors,
each with its own mass and electric charge. A quark also carries an
additional quantum number, color, which takes one of three values. Color
serves as the “charge” of QCD, as we will discuss in Chapter 17.

According to QCD, the simplest $e^+e^-$ process that ends in hadrons is

$$e^+e^- \to q\bar q,$$

the annihilation of an electron and a positron, through a virtual photon, into a
quark-antiquark pair. After they are created, the quarks interact with one
another through their strong forces, producing more quark pairs. Eventually the
quarks and antiquarks combine to form some number of mesons and baryons.

To adapt our results for muon production to handle the case of quarks, 
we must make three modifications: 

1. Replace the muon charge $e$ with the quark charge $Q|e|$.

2. Count each quark three times, one for each color.

3. Include the effects of the strong interactions of the produced quark and
antiquark.

The first two changes are easy to make. For the first, it is simply necessary to
know the masses and charges of each flavor of quark. For $u$, $c$, and $t$ quarks
we have $Q = 2/3$, while for $d$, $s$, and $b$ quarks we have $Q = -1/3$. The
cross-section formulae are proportional to the square of the charge of the final-state
particle, so we can simply insert a factor of $Q^2$ into any of these formulae
to obtain the cross section for production of any particular variety of quark.
Counting colors is necessary because experiments measure only the total cross
section for production of all three colors. (The hadrons that are actually
detected are colorless.) In any case, this counting is easy: Just multiply the
answer by 3.

If you know a little about the strong interaction, however, you might 
think this is all a big joke. Surely the third modification is extremely difficult 
to make, and will drastically alter the predictions of QED. The amazing truth 
is that in the high-energy limit, the effect of the strong interaction on the 
quark production process can be completely neglected. As we will discuss in 
Part III, the only effect of the strong interaction (in this limit) is to dress
up the final-state quarks into bunches of hadrons. This simplification is due
to a phenomenon called asymptotic freedom; it played a crucial role in the
identification of Quantum Chromodynamics as the correct theory of the strong
force.

Thus in the high-energy limit, we expect the cross section for the reaction
$e^+e^- \to q\bar q$ to approach $3\cdot Q^2\cdot 4\pi\alpha^2/3E_{\rm cm}^2$. It is conventional to define

$$1 \text{ unit of } R \equiv \frac{4\pi\alpha^2}{3E_{\rm cm}^2} = \frac{86.8\ \text{nbarns}}{(E_{\rm cm}\ \text{in GeV})^2}. \tag{5.15}$$

The value of a cross section in units of $R$ is therefore its ratio to the asymptotic
value of the $e^+e^- \to \mu^+\mu^-$ cross section predicted by Eq. (5.14). Experimentally,
the easiest quantity to measure is the total rate for production of all
hadrons. Asymptotically, we expect

$$\sigma(e^+e^- \to \text{hadrons}) \underset{E_{\rm cm} \to \infty}{\longrightarrow} 3\cdot\Bigl(\sum_i Q_i^2\Bigr)R, \tag{5.16}$$

where the sum runs over all quarks whose masses are smaller than $E_{\rm cm}/2$.
When $E_{\rm cm}/2$ is in the vicinity of one of the quark masses, the strong
interactions cause large deviations from this formula. The most dramatic such effect
is the appearance of bound states just below $E_{\rm cm} = 2m_q$, manifested as very
sharp spikes in the cross section.
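The prediction (5.16) amounts to simple arithmetic with the quark charges, and the numerical value in (5.15) follows from unit conversion. The sketch below does both. The conversion constant $(\hbar c)^2 \approx 0.389379$ mb GeV$^2$ and the value of $\alpha$ are assumed inputs not given in the text, and the function names are invented for this illustration.

```python
import math
from fractions import Fraction

# Quark charges in units of |e|, as quoted in the text.
CHARGES = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}

ALPHA = 1 / 137.036      # fine-structure constant (assumed input value)
HBARC2_NB = 0.389379e6   # (hbar c)^2 in nb GeV^2 (assumed conversion constant)


def r_ratio(active_flavors, n_colors=3):
    """Asymptotic sigma(hadrons) in units of R: n_colors * sum of Q_i^2, Eq. (5.16)."""
    return n_colors * sum(CHARGES[q] ** 2 for q in active_flavors)


def unit_of_r_nb(e_cm_gev):
    """One unit of R, 4 pi alpha^2 / (3 E_cm^2), converted to nanobarns."""
    return 4 * math.pi * ALPHA**2 / 3 * HBARC2_NB / e_cm_gev**2


for flavors in ("uds", "udsc", "udscb"):
    print(flavors, r_ratio(flavors))
print(unit_of_r_nb(1.0))
```

Setting `n_colors=1` in `r_ratio` shows what the prediction would be without color, which is the comparison that makes the measured plateau heights evidence for the factor of 3.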

Experimental measurements of the cross section for $e^+e^-$ annihilation to
hadrons between 2.5 and 40 GeV are shown in Fig. 5.3. The data shows three
distinct regions: a low-energy region in which $u$, $d$, and $s$ quark pairs are
produced; a region above the threshold for production of $c$ quark pairs; and
a region also above the threshold for $b$ quark pairs. The prediction (5.16) is
shown as a set of solid lines; it agrees quite well with the data in each region,
as long as the energy is well away from the thresholds where the high-energy
approximation breaks down. The dotted curves show an improved theoretical
prediction, including higher-order corrections from QCD, which we will discuss
in Section 17.2. This explanation of the $e^+e^-$ annihilation cross section is a
remarkable success of QCD. In particular, experimental verification of the
factor of 3 in (5.16) is one piece of evidence for the existence of color.

The angular dependence of the differential cross section is also observed
experimentally.* At high energy the hadrons appear in jets, clusters of several
hadrons all moving in approximately the same direction. In most cases there
are two jets, with back-to-back momenta, and these indeed have the angular
dependence $(1 + \cos^2\theta)$.


*The basic features of hadron production in high-energy $e^+e^-$ annihilation are
reviewed by P. Duinker, Rev. Mod. Phys. 54, 325 (1982).

Figure 5.3. Experimental measurements of the total cross section for the
reaction $e^+e^- \to$ hadrons, from the data compilation of M. Swartz, Phys.
Rev. D53, 5268 (1996). Complete references to the various experiments are
given there. The measurements are compared to theoretical predictions from
Quantum Chromodynamics, as explained in the text. The solid line is the
simple prediction (5.16).


5.2 $e^+e^- \to \mu^+\mu^-$: Helicity Structure

The unpolarized cross section for a reaction is generally easy to calculate
(and to measure) but hard to understand. Where does the $(1 + \cos^2\theta)$ angular
dependence come from? We can answer this question by computing the
$e^+e^- \to \mu^+\mu^-$ cross section for each set of spin orientations separately.

First we must choose a basis of polarization states. To get a simple answer 
in the high-energy limit, the best choice is to quantize each spin along the 
direction of the particle’s motion, that is, to use states of definite helicity. 
Recall that in the massless limit, the left- and right-handed helicity states 
of a Dirac particle live in different representations of the Lorentz group. We 
might therefore expect them to behave independently, and in fact they do. 

In this section we will compute the polarized $e^+e^- \to \mu^+\mu^-$ cross sections,
using the helicity basis, in two different ways: first, by using trace technology
but with the addition of helicity projection operators to project out the desired
left- or right-handed spinors; and second, by plugging explicit expressions for
these spinors directly into our formula for the amplitude $\mathcal{M}$. Throughout this
section we work in the high-energy limit where all fermions are effectively
massless. (The calculation can be done for lower energy, but it is much more
difficult and no more instructive.)†

Our starting point for both methods of calculating the polarized cross
section is the amplitude

$$i\mathcal{M}\bigl(e^-(p)e^+(p') \to \mu^-(k)\mu^+(k')\bigr) = \frac{ie^2}{q^2}\bigl(\bar v(p')\gamma^\mu u(p)\bigr)\bigl(\bar u(k)\gamma_\mu v(k')\bigr). \tag{5.1}$$

We would like to use the spin sum identities to write the squared amplitude
in terms of traces as before, even though we now want to consider only one
set of polarizations at a time. To do this, we note that for massless fermions,
the matrices

$$\frac{1 + \gamma^5}{2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \frac{1 - \gamma^5}{2} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{5.17}$$

are projection operators onto right- and left-handed spinors, respectively. Thus
if in (5.1) we make the replacement

$$\bar v(p')\gamma^\mu u(p) \to \bar v(p')\gamma^\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)u(p),$$

the amplitude for a right-handed electron is unchanged while that for a
left-handed electron becomes zero. Note that since

$$\bar v(p')\gamma^\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)u(p) = v^\dagger(p')\Bigl(\frac{1 + \gamma^5}{2}\Bigr)\gamma^0\gamma^\mu u(p), \tag{5.18}$$

this same replacement imposes the requirement that $v(p')$ also be a right-handed
spinor. Recall from Section 3.5, however, that the right-handed spinor
$v(p')$ corresponds to a left-handed positron. Thus we see that the annihilation
amplitude vanishes when both the electron and the positron are right-handed.
In general, the amplitude vanishes (in the massless limit) unless the electron
and positron have opposite helicity, or equivalently, unless their spinors have
the same helicity.

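Having inserted this projection operator, we are now free to sum over the
electron and positron spins in the squared amplitude; of the four terms in the
sum, only one (the one we want) is nonzero. The electron half of $|\mathcal{M}|^2$, for a
right-handed electron and a left-handed positron, is then

$$\begin{aligned}
\sum_{\text{spins}}\left|\bar v(p')\gamma^\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)u(p)\right|^2 &= \sum_{\text{spins}}\bar v(p')\gamma^\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)u(p)\;\bar u(p)\gamma^\nu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)v(p') \\
&= \operatorname{tr}\Bigl[\slashed{p}'\gamma^\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)\slashed{p}\gamma^\nu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)\Bigr] \\
&= \operatorname{tr}\Bigl[\slashed{p}'\gamma^\mu\slashed{p}\gamma^\nu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)\Bigr] \\
&= 2\,\bigl(p'^\mu p^\nu + p'^\nu p^\mu - g^{\mu\nu}\,p\cdot p' - i\epsilon^{\alpha\mu\beta\nu}p'_\alpha p_\beta\bigr).
\end{aligned} \tag{5.19}$$

†The general formalism for S-matrix elements between states of definite helicity is
presented in a beautiful paper of M. Jacob and G. C. Wick, Ann. Phys. 7, 404 (1959).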


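The indices in this expression are to be dotted into those of the muon half
of the squared amplitude. For a right-handed $\mu^-$ and a left-handed $\mu^+$, an
identical calculation yields

$$\sum_{\text{spins}}\left|\bar u(k)\gamma_\mu\Bigl(\frac{1 + \gamma^5}{2}\Bigr)v(k')\right|^2 = 2\,\bigl(k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}\,k\cdot k' - i\epsilon_{\rho\mu\sigma\nu}k^\rho k'^\sigma\bigr). \tag{5.20}$$

Dotting (5.19) into (5.20), we find that the squared matrix element for
$e^-_R e^+_L \to \mu^-_R\mu^+_L$ in the center-of-mass frame is

$$\begin{aligned}
|\mathcal{M}|^2 &= \frac{4e^4}{q^4}\Bigl[2(p\cdot k)(p'\cdot k') + 2(p\cdot k')(p'\cdot k) - \epsilon^{\alpha\mu\beta\nu}\epsilon_{\rho\mu\sigma\nu}\,p'_\alpha p_\beta k^\rho k'^\sigma\Bigr] \\
&= \frac{8e^4}{q^4}\Bigl[(p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k) - (p\cdot k)(p'\cdot k') + (p\cdot k')(p'\cdot k)\Bigr] \\
&= \frac{16e^4}{q^4}\,(p\cdot k')(p'\cdot k) \\
&= e^4\,(1 + \cos\theta)^2.
\end{aligned} \tag{5.21}$$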
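The step from the first to the second line of (5.21) uses the last identity of (5.6). As a sketch of the bookkeeping: the two contracted indices sit in the same slots of both $\epsilon$ tensors, so they can be moved to the front of each without a net sign, and then

```latex
\begin{aligned}
\epsilon^{\alpha\mu\beta\nu}\epsilon_{\rho\mu\sigma\nu}\,p'_\alpha p_\beta k^\rho k'^\sigma
  &= \epsilon^{\mu\nu\alpha\beta}\epsilon_{\mu\nu\rho\sigma}\,p'_\alpha p_\beta k^\rho k'^\sigma \\
  &= -2\bigl(\delta^\alpha_\rho\delta^\beta_\sigma - \delta^\alpha_\sigma\delta^\beta_\rho\bigr)\,
     p'_\alpha p_\beta k^\rho k'^\sigma \\
  &= -2\bigl[(p'\cdot k)(p\cdot k') - (p'\cdot k')(p\cdot k)\bigr].
\end{aligned}
```

With the overall minus sign from $(-i)^2$, the $\epsilon$ term therefore contributes $+2(p\cdot k')(p'\cdot k) - 2(p\cdot k)(p'\cdot k')$, which cancels the first symmetric term and doubles the second. In the massless center-of-mass frame, $p\cdot k' = p'\cdot k = E^2(1 + \cos\theta)$ and $q^4 = 16E^4$, which gives the last line of (5.21).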

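Plugging this result into (4.85) gives the differential cross section,

$$\frac{d\sigma}{d\Omega}\bigl(e^-_R e^+_L \to \mu^-_R\mu^+_L\bigr) = \frac{\alpha^2}{4E_{\rm cm}^2}\,(1 + \cos\theta)^2. \tag{5.22}$$

There is no need to repeat the entire calculation to obtain the other
three nonvanishing helicity amplitudes. For example, the muon half of the squared
amplitude for $e^-_R e^+_L \to \mu^-_L\mu^+_R$ is identical to (5.20) but with $\gamma^5$ replaced by $-\gamma^5$ on the
left-hand side, and thus $\epsilon_{\rho\mu\sigma\nu}$ replaced by $-\epsilon_{\rho\mu\sigma\nu}$ on the right-hand side.
Propagating this sign through (5.21), we easily see that

$$\frac{d\sigma}{d\Omega}\bigl(e^-_R e^+_L \to \mu^-_L\mu^+_R\bigr) = \frac{\alpha^2}{4E_{\rm cm}^2}\,(1 - \cos\theta)^2. \tag{5.23}$$

Similarly,

$$\begin{aligned}
\frac{d\sigma}{d\Omega}\bigl(e^-_L e^+_R \to \mu^-_R\mu^+_L\bigr) &= \frac{\alpha^2}{4E_{\rm cm}^2}\,(1 - \cos\theta)^2; \\
\frac{d\sigma}{d\Omega}\bigl(e^-_L e^+_R \to \mu^-_L\mu^+_R\bigr) &= \frac{\alpha^2}{4E_{\rm cm}^2}\,(1 + \cos\theta)^2.
\end{aligned} \tag{5.24}$$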

(These two results actually follow from the previous two by parity invariance.)
The other twelve helicity cross sections (for instance, $e^-_R e^+_R \to \mu^-_L\mu^+_R$) are zero,
as we saw from Eq. (5.18). Adding up all sixteen contributions, and dividing
by 4 to average over the electron and positron spins, we recover the unpolarized
cross section in the massless limit, Eq. (5.14).

Figure 5.4. Conservation of angular momentum requires that if the $z$-component
of angular momentum is measured, it must have the same value
as initially.


Note that the cross section (5.22) for $e^-_R e^+_L \to \mu^-_R\mu^+_L$ vanishes at $\theta = 180°$.
This is just what we would expect, since for $\theta = 180°$, the total angular
momentum of the final state is opposite to that of the initial state (see Figure 5.4).
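The statement that the helicity cross sections add up to the unpolarized result can be checked in one line. Only the four cross sections (5.22) to (5.24) are nonzero, two with each angular factor, so averaging over the four initial spin configurations gives

```latex
\frac{1}{4}\cdot\frac{\alpha^2}{4E_{\rm cm}^2}
  \Bigl[2(1+\cos\theta)^2 + 2(1-\cos\theta)^2\Bigr]
  = \frac{\alpha^2}{4E_{\rm cm}^2}\,\bigl(1+\cos^2\theta\bigr),
```

which is the high-energy form of $d\sigma/d\Omega$ in Eq. (5.14). The terms linear in $\cos\theta$ cancel between the two helicity classes, so the forward-backward asymmetry of each individual helicity channel is invisible in the unpolarized measurement.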

This completes our first calculation of the polarized $e^+e^- \to \mu^+\mu^-$ cross
sections. We will now redo the calculation in a manner that is more
straightforward, more enlightening, and no more difficult. We will calculate the
amplitude $\mathcal{M}$ (rather than the squared amplitude) directly, using explicit values
for the spinors and $\gamma$ matrices. This method does have its drawbacks: It forces
us to specialize to a particular frame of reference much sooner, so manifest
Lorentz invariance is lost. More pragmatically, it is very cumbersome except
in the nonrelativistic and ultra-relativistic limits.

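Consider again the amplitude

$$\mathcal{M} = \frac{e^2}{q^2}\bigl(\bar v(p')\gamma^\mu u(p)\bigr)\bigl(\bar u(k)\gamma_\mu v(k')\bigr). \tag{5.25}$$

In the high-energy limit, our general expressions for Dirac spinors become

$$\begin{aligned}
u(p) &= \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ \sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} \underset{E \to \infty}{\longrightarrow} \sqrt{2E}\begin{pmatrix} \tfrac{1}{2}(1 - \hat p\cdot\boldsymbol{\sigma})\,\xi \\ \tfrac{1}{2}(1 + \hat p\cdot\boldsymbol{\sigma})\,\xi \end{pmatrix}; \\
v(p) &= \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi \\ -\sqrt{p\cdot\bar\sigma}\,\xi \end{pmatrix} \underset{E \to \infty}{\longrightarrow} \sqrt{2E}\begin{pmatrix} \tfrac{1}{2}(1 - \hat p\cdot\boldsymbol{\sigma})\,\xi \\ -\tfrac{1}{2}(1 + \hat p\cdot\boldsymbol{\sigma})\,\xi \end{pmatrix}.
\end{aligned} \tag{5.26}$$

A right-handed spinor satisfies $(\hat p\cdot\boldsymbol{\sigma})\xi = +\xi$, while a left-handed spinor has
$(\hat p\cdot\boldsymbol{\sigma})\xi = -\xi$. (Remember once again that for antiparticles, the handedness of
the spinor is the opposite of the handedness of the particle.) We must evaluate
expressions of the form $\bar v\gamma^\mu u$, so we need

$$\gamma^0\gamma^\mu = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} = \begin{pmatrix} \bar\sigma^\mu & 0 \\ 0 & \sigma^\mu \end{pmatrix}. \tag{5.27}$$

Thus we see explicitly that the amplitude is zero when one of the spinors is
left-handed and the other is right-handed. In the language of Chapter 1, the
Clebsch-Gordan coefficients that couple the vector photon to the product of
such spinors are zero; those coefficients are just the off-block-diagonal elements
of the matrix $\gamma^0\gamma^\mu$ (in the chiral representation).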

Let us choose $p$ and $p'$ to be in the $\pm\hat z$-directions, and first consider
the case where the electron is right-handed and the positron is left-handed:


Thus for the electron we have $\xi = \binom{1}{0}$, corresponding to spin up in the
$\hat z$-direction, while for the positron we have $\xi = \binom{0}{1}$, also
corresponding to (physical) spin up in the $\hat z$-direction. Both particles have
$(\hat p\cdot\boldsymbol\sigma)\xi = +\xi$, so the spinors are

$$u(p) = \sqrt{2E}\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}; \qquad v(p') = \sqrt{2E}\begin{pmatrix} 0 \\ 0 \\ 0 \\ -1 \end{pmatrix}. \tag{5.28}$$

The electron half of the matrix element is therefore

$$\bar v(p')\gamma^\mu u(p) = 2E\,(0,\,-1)\,\sigma^\mu\begin{pmatrix} 1 \\ 0 \end{pmatrix} = -2E\,(0,\,1,\,i,\,0). \tag{5.29}$$

We can interpret this expression by saying that the virtual photon has circular
polarization in the $+\hat z$-direction; its polarization vector is
$\boldsymbol\epsilon_+ = (1/\sqrt2)(\hat x + i\hat y)$.

Next we must calculate the muon half of the matrix element. Let the $\mu^-$
be emitted at an angle $\theta$ to the $z$-axis, and consider first the case where it is
right-handed (and the $\mu^+$ is therefore left-handed):


To calculate $\bar u(k)\gamma^\mu v(k')$ we could go back to expressions (5.26), but then it
would be necessary to find the correct spinors $\xi$ corresponding to polarization
along the muon momentum. It is much easier to use a trick: Since any expression
of the form $\bar\psi\gamma^\mu\psi$ transforms like a 4-vector, we can just rotate the result
(5.29). Rotating that vector by an angle $\theta$ in the $xz$-plane, we find

$$
\begin{aligned}
\bar u(k)\gamma^\mu v(k') &= \bigl[\bar v(k')\gamma^\mu u(k)\bigr]^* \\
&= \bigl[-2E\,(0,\,\cos\theta,\,i,\,-\sin\theta)\bigr]^* \\
&= -2E\,(0,\,\cos\theta,\,-i,\,-\sin\theta).
\end{aligned}
\tag{5.30}
$$

This vector can also be interpreted as the polarization of the virtual photon;
when it has a nonzero overlap with (5.29), we get a nonzero amplitude.
Plugging (5.29) and (5.30) into (5.25), we see that the amplitude is

$$\mathcal{M}(e^-_R e^+_L \to \mu^-_R\mu^+_L) = \frac{e^2}{q^2}(2E)^2(-\cos\theta - 1) = -e^2(1+\cos\theta), \tag{5.31}$$

in agreement with (1.6), and also with (5.21). The differential cross section for
this set of helicities can now be obtained in the same way as above, yielding
(5.22).
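
As a quick numerical cross-check of (5.31), the contraction of the two current 4-vectors can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the default choice $E = e = 1$ are assumptions introduced here, not notation from the text.

```python
import math

# Metric signature (+,-,-,-), as used throughout the text.
METRIC = (1.0, -1.0, -1.0, -1.0)

def minkowski_dot(a, b):
    """Contract two (possibly complex) 4-vectors, a^mu b_mu, without conjugation."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def amplitude_rl_to_rl(theta, E=1.0, e=1.0):
    """M(e-_R e+_L -> mu-_R mu+_L) assembled from (5.25), (5.29) and (5.30)."""
    electron_half = [-2 * E * c for c in (0, 1, 1j, 0)]              # eq. (5.29)
    muon_half = [-2 * E * c
                 for c in (0, math.cos(theta), -1j, -math.sin(theta))]  # eq. (5.30)
    q2 = (2 * E) ** 2  # the virtual photon carries the full CM energy
    return e**2 / q2 * minkowski_dot(electron_half, muon_half)

# The result should equal -e^2 (1 + cos(theta)) for any angle.
for theta in (0.0, 0.7, math.pi / 2, math.pi):
    expected = -(1 + math.cos(theta))
    assert abs(amplitude_rl_to_rl(theta) - expected) < 1e-12
```

The amplitude vanishes at $\theta = \pi$, where the muon current vector has no overlap with the electron current vector.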

We can calculate the other three nonvanishing helicity amplitudes in an
analogous manner. For a left-handed electron and a right-handed positron, we
easily find

$$\bar v(p')\gamma^\mu u(p) = -2E\,(0,\,1,\,-i,\,0) \equiv -2E\cdot\sqrt2\,\epsilon_-^\mu.$$

Perform a rotation to get the vector corresponding to a left-handed $\mu^-$ and a
right-handed $\mu^+$:

$$\bar u(k)\gamma^\mu v(k') = -2E\,(0,\,\cos\theta,\,i,\,-\sin\theta).$$

Putting the pieces together in various ways yields the remaining amplitudes,

$$
\begin{aligned}
\mathcal{M}(e^-_L e^+_R \to \mu^-_L\mu^+_R) &= -e^2(1+\cos\theta); \\
\mathcal{M}(e^-_R e^+_L \to \mu^-_L\mu^+_R) = \mathcal{M}(e^-_L e^+_R \to \mu^-_R\mu^+_L) &= -e^2(1-\cos\theta).
\end{aligned}
\tag{5.32}
$$

Now let us go to the other end of the energy spectrum, and discuss the reaction
$e^+e^- \to \mu^+\mu^-$ in the extreme nonrelativistic limit. When $E$ is barely
larger than $m_\mu$, our previous result (5.12) for the unpolarized differential cross
section becomes

$$\frac{d\sigma}{d\Omega} \;\xrightarrow{|\mathbf{k}|\to 0}\; \frac{\alpha^2}{2E_{\rm cm}^2}\sqrt{1-\frac{m_\mu^2}{E^2}} = \frac{\alpha^2}{2E_{\rm cm}^2}\frac{|\mathbf{k}|}{E}. \tag{5.33}$$


We can recover this result, and also learn something about the spin dependence
of the reaction, by evaluating the amplitude with explicit spinors.
Once again we begin with the matrix element

$$\mathcal{M} = \frac{e^2}{q^2}\,\bigl(\bar v(p')\gamma^\mu u(p)\bigr)\bigl(\bar u(k)\gamma_\mu v(k')\bigr).$$

Figure 5.5. In the nonrelativistic limit the total spin of the system is conserved,
and thus the muons are produced with both spins up along the $z$-axis.


The electron and positron are still very relativistic, so this expression will be
simplest if we choose them to have definite helicity. Let the electron be
right-handed, moving in the $+\hat z$-direction, and the positron be left-handed, moving
in the $-\hat z$-direction. Then from Eq. (5.29) we have

$$\bar v(p')\gamma^\mu u(p) = -2E\,(0,\,1,\,i,\,0). \tag{5.34}$$


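In the other half of the matrix element we should use the nonrelativistic
expressions

$$u(k) = \sqrt{m}\begin{pmatrix} \xi \\ \xi \end{pmatrix}; \qquad v(k') = \sqrt{m}\begin{pmatrix} \xi' \\ -\xi' \end{pmatrix}. \tag{5.35}$$

Keep in mind, in the discussion of this section, that the spinor $\xi'$ gives the
flipped spin of the antiparticle. Leaving the muon spinors $\xi$ and $\xi'$
undetermined for now, we can easily compute

$$
\bar u(k)\gamma^\mu v(k') = m\,(\xi^\dagger,\ \xi^\dagger)\begin{pmatrix} \bar\sigma^\mu & 0 \\ 0 & \sigma^\mu \end{pmatrix}\begin{pmatrix} \xi' \\ -\xi' \end{pmatrix}
= \begin{cases} 0 & \text{for } \mu = 0, \\ -2m\,\xi^\dagger\sigma^i\xi' & \text{for } \mu = i. \end{cases}
\tag{5.36}
$$

To evaluate $\mathcal{M}$, we simply dot (5.34) into (5.36) and multiply by
$e^2/q^2 = e^2/4m^2$. The result is

$$\mathcal{M}(e^-_R e^+_L \to \mu^+\mu^-) = -2e^2\,\xi^\dagger\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\xi'. \tag{5.37}$$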


Since there is no angular dependence in this expression, the muons are equally
likely to come out in any direction. More precisely, they are emitted in an
s-wave; their orbital angular momentum is zero. Angular momentum conservation
therefore requires that the total spin of the final state equal 1, and
indeed the matrix product gives zero unless both the muon and the antimuon
have spin up along the $z$-axis (see Fig. 5.5).

To find the total rate for this process, we sum over muon spins to obtain
$|\mathcal{M}|^2 = 4e^4$, which yields the cross section

$$\frac{d\sigma}{d\Omega}(e^-_R e^+_L \to \mu^+\mu^-) = \frac{\alpha^2}{E_{\rm cm}^2}\frac{|\mathbf{k}|}{E}. \tag{5.38}$$

The same expression holds for a left-handed electron and a right-handed
positron. Thus the spin-averaged cross section is just $2\cdot(1/4)$ times this
expression, in agreement with (5.33).
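
The spin sum and the step from (5.37) to (5.38) can be checked with a short Python sketch. It assumes the standard center-of-mass formula $d\sigma/d\Omega = |\mathbf{k}|\,|\mathcal{M}|^2/(32\pi^2 E_{\rm cm}^3)$ for massless colliding beams with $E_{\rm cm} = 2E$; the function names and the sample energy are illustrative choices made here.

```python
import math

# Two-component basis spinors: spin up / down along z.
UP, DOWN = (1.0, 0.0), (0.0, 1.0)
RAISING = ((0.0, 1.0), (0.0, 0.0))  # the matrix appearing in (5.37)

def amplitude_nr(xi, xi_prime, e=1.0):
    """M = -2 e^2 xi^dagger [[0,1],[0,0]] xi' for real basis spinors, eq. (5.37)."""
    bracket = sum(xi[i] * RAISING[i][j] * xi_prime[j]
                  for i in range(2) for j in range(2))
    return -2 * e**2 * bracket

# Sum |M|^2 over the four muon spin configurations (in units of e^4).
spin_sum = sum(amplitude_nr(xi, xp) ** 2
               for xi in (UP, DOWN) for xp in (UP, DOWN))

def dsigma_from_amplitude(E, m_mu, alpha=1 / 137.036):
    """dsigma/dOmega from |M|^2 = spin_sum * e^4, massless initial state (assumed formula)."""
    k = math.sqrt(E**2 - m_mu**2)
    e4 = (4 * math.pi * alpha) ** 2
    return k * spin_sum * e4 / (32 * math.pi**2 * (2 * E) ** 3)

def dsigma_5_38(E, m_mu, alpha=1 / 137.036):
    """Right-hand side of eq. (5.38)."""
    k = math.sqrt(E**2 - m_mu**2)
    return alpha**2 / (2 * E) ** 2 * k / E
```

Only the configuration with $\xi$ up and $\xi'$ down contributes, so `spin_sum` comes out as 4, and the two cross-section functions agree for any energy above threshold.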


Bound States 

Until now we have considered the initial and final states of scattering processes
to be states of isolated single particles. Very close to threshold, however, the
Coulomb attraction of the muons should become an important effect. Just
below threshold, we can still form $\mu^+\mu^-$ pairs in electromagnetic bound states.

The treatment of bound states in quantum field theory is a rich and
complex subject, but one that lies mainly beyond the scope of this book.†
Fortunately, many of the familiar bound systems in Nature can be treated (at
least to a good first approximation) as nonrelativistic systems, in which the
internal motions are slow. The process of creating the constituent particles out
of the vacuum is still a relativistic effect, requiring quantum field theory for its
proper description. In this section we will develop a formalism for computing
the amplitudes for creation and annihilation of two-particle, nonrelativistic
bound states. We begin with a computation of the cross section for producing
a $\mu^+\mu^-$ bound state in $e^+e^-$ annihilation.

Consider first the case where the spins of the electron and positron both
point up along the $z$-axis. From the preceding discussion we know that the
resulting muons both have spin up, so the only type of bound state we can
produce will have total spin 1, also pointing up. The amplitude for producing
free muons in this configuration is

$$\mathcal{M}(\uparrow\uparrow \to \mathbf{k}_1\uparrow, \mathbf{k}_2\uparrow) = -2e^2, \tag{5.39}$$

independent of the momenta (which we now call $\mathbf{k}_1$ and $\mathbf{k}_2$) of the muons.

†Reviews of this subject can be found in Bodwin, Yennie, and Gregorio, Rev.
Mod. Phys. 57, 723 (1985), and in Sapirstein and Yennie, in Kinoshita (1990).

Next we need to know how to write a bound state in terms of free-particle
states. For a general two-body system with equal constituent masses, the
center-of-mass and relative coordinates are

$$\mathbf{R} = \tfrac12(\mathbf{r}_1 + \mathbf{r}_2), \qquad \mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \tag{5.40}$$

These have conjugate momenta

$$\mathbf{K} = \mathbf{k}_1 + \mathbf{k}_2, \qquad \mathbf{k} = \tfrac12(\mathbf{k}_1 - \mathbf{k}_2). \tag{5.41}$$

The total momentum $\mathbf{K}$ is zero in the center-of-mass frame. If we know the
force between the particles (for $\mu^+\mu^-$, it is just the Coulomb force), we can
solve the nonrelativistic Schrödinger equation to find the Schrödinger
wavefunction, $\psi(\mathbf{r})$. The bound state is just a linear superposition of free states
of definite $\mathbf{r}$ or $\mathbf{k}$, weighted by this wavefunction. For our purposes it is more
convenient to build this superposition in momentum space, using the Fourier
transform of $\psi(\mathbf{r})$:

$$\tilde\psi(\mathbf{k}) = \int d^3x\; e^{i\mathbf{k}\cdot\mathbf{r}}\,\psi(\mathbf{r}); \qquad \int\frac{d^3k}{(2\pi)^3}\,\bigl|\tilde\psi(\mathbf{k})\bigr|^2 = 1. \tag{5.42}$$

If $\psi(\mathbf{r})$ is normalized conventionally, $\tilde\psi(\mathbf{k})$ gives the amplitude for finding a
particular value of $\mathbf{k}$. An explicit expression for a bound state with mass
$M \approx 2m$, momentum $\mathbf{K} = 0$, and spin 1 oriented up is then

$$|B\rangle = \sqrt{2M}\int\frac{d^3k}{(2\pi)^3}\,\tilde\psi(\mathbf{k})\,\frac{1}{\sqrt{2m}}\frac{1}{\sqrt{2m}}\,|\mathbf{k}\uparrow, -\mathbf{k}\uparrow\rangle. \tag{5.43}$$

The factors of $(1/\sqrt{2m})$ convert our relativistically normalized free-particle
states so that their integral with $\tilde\psi(\mathbf{k})$ is a state of norm 1. (The factors
should involve $\sqrt{2E_{\pm\mathbf{k}}}$, but for a nonrelativistic bound state, $|\mathbf{k}| \ll m$.) The
outside factor of $\sqrt{2M}$ converts back to the relativistic normalization assumed
by our formula for cross sections. These normalization factors could easily be
modified to describe a bound state with nonzero total momentum $\mathbf{K}$.

Given this expression for the bound state, we can immediately write down
the amplitude for its production:

$$\mathcal{M}(\uparrow\uparrow \to B) = \sqrt{2M}\int\frac{d^3k}{(2\pi)^3}\,\tilde\psi^*(\mathbf{k})\,\frac{1}{\sqrt{2m}}\frac{1}{\sqrt{2m}}\,\mathcal{M}(\uparrow\uparrow \to \mathbf{k}\uparrow, -\mathbf{k}\uparrow). \tag{5.44}$$

Since the free-state amplitude from (5.39) is independent of the momenta of
the muons, the integral over $\mathbf{k}$ gives $\psi^*(0)$, the position-space wavefunction
evaluated at the origin. It is quite natural that the amplitude for creation of
a two-particle state from a pointlike virtual photon should be proportional to
the value of the wavefunction at zero separation. Assembling the pieces, we
find that the amplitude is simply

$$\mathcal{M}(\uparrow\uparrow \to B) = \sqrt{\frac{2}{M}}\,(-2e^2)\,\psi^*(0). \tag{5.45}$$

In a moment we will compute the cross section from this amplitude. First,
however, let us generalize this discussion to treat bound states with more
general spin configurations. The analysis leading up to (5.37) will cast any
S-matrix element for the production of nonrelativistic fermions with momenta
$\mathbf{k}$ and $-\mathbf{k}$ into the form of a spin matrix element

$$i\mathcal{M}(\text{something} \to \mathbf{k}, \mathbf{k}') = \xi^\dagger\bigl[\Gamma(\mathbf{k})\bigr]\xi', \tag{5.46}$$

where $\Gamma(\mathbf{k})$ is some $2\times2$ matrix. We now must replace the spinors with a
normalized spin wavefunction for the bound state. In the example just completed,
we replaced

$$\xi'\xi^\dagger \to \begin{pmatrix} 0 \\ 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{5.47}$$

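More generally, a spin-1 state is obtained by the replacement

$$\xi'\xi^\dagger \to \frac{1}{\sqrt2}\,\mathbf{n}^*\cdot\boldsymbol\sigma, \tag{5.48}$$

where $\mathbf{n}$ is a unit vector. Choosing $\mathbf{n} = (\hat x + i\hat y)/\sqrt2$ gives back (5.47), while
the choices $\mathbf{n} = (\hat x - i\hat y)/\sqrt2$ and $\mathbf{n} = \hat z$ give the other two spin-1 states
$\downarrow\downarrow$ and $(\uparrow\downarrow + \downarrow\uparrow)/\sqrt2$. (The relative minus sign in (5.48) for this last case
comes from the rule (3.135) for the flipped spin.) Similarly, the spin-zero
state $(\uparrow\downarrow - \downarrow\uparrow)/\sqrt2$ is given by the replacement

$$\xi'\xi^\dagger \to \frac{1}{\sqrt2}\,\mathbf{1}, \tag{5.49}$$

involving the $2\times2$ unit matrix. With these rules, we can convert an S-matrix
element of the form (5.46) quite generally into an S-matrix element for
production of a bound state at rest:

$$i\mathcal{M}(\text{something} \to B) = \sqrt{\frac{2}{M}}\int\frac{d^3k}{(2\pi)^3}\,\tilde\psi^*(\mathbf{k})\,\mathrm{tr}\Bigl(\frac{\mathbf{n}^*\cdot\boldsymbol\sigma}{\sqrt2}\,\Gamma(\mathbf{k})\Bigr), \tag{5.50}$$

where the trace is taken over 2-component spinor indices. For a spin-0 bound
state, replace $\mathbf{n}^*\cdot\boldsymbol\sigma$ by the unit matrix.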
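
The replacement rule (5.48) is easy to verify numerically. The following Python sketch (with illustrative names such as `spin_one_matrix` that are not from the text) builds $\mathbf{n}^*\cdot\boldsymbol\sigma/\sqrt2$ for the three polarization vectors and traces it against the matrix of (5.37), showing that only the spin-up state couples to the right-handed electron current.

```python
import math

# Pauli matrices as nested tuples.
SIGMA = (
    ((0, 1), (1, 0)),       # sigma^1
    ((0, -1j), (1j, 0)),    # sigma^2
    ((1, 0), (0, -1)),      # sigma^3
)

def spin_one_matrix(n):
    """Return n^* . sigma / sqrt(2), the replacement (5.48) for xi' xi^dagger."""
    return tuple(
        tuple(sum(complex(n[a]).conjugate() * SIGMA[a][i][j] for a in range(3))
              / math.sqrt(2)
              for j in range(2))
        for i in range(2)
    )

def trace_product(A, B):
    """tr(A B) for 2x2 matrices."""
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))

s = 1 / math.sqrt(2)
n_up = (s, 1j * s, 0)     # (x + i y)/sqrt(2): both spins up
n_down = (s, -1j * s, 0)  # (x - i y)/sqrt(2): both spins down
n_zero = (0, 0, 1)        # z: the S_z = 0 member of the triplet

GAMMA_5_37 = ((0, 1), (0, 0))  # spin matrix from eq. (5.37), up to -2e^2

overlaps = {name: trace_product(spin_one_matrix(n), GAMMA_5_37)
            for name, n in (("up", n_up), ("down", n_down), ("zero", n_zero))}
```

Here `spin_one_matrix(n_up)` reproduces the matrix in (5.47), and the overlaps come out as 1 for the spin-up state and 0 for the other two.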

Vector Meson Production and Decay 

Equation (5.45) can be straightforwardly converted into a cross section for
production of $\mu^+\mu^-$ bound states in $e^+e^-$ annihilation. To make it easier to
extract all the physics in this equation, let us introduce polarization vectors
for the initial and final spin configurations: $\boldsymbol\epsilon_+ = (\hat x + i\hat y)/\sqrt2$, from Eq. (5.29),
and $\mathbf{n}$, from Eq. (5.48). Then (5.45) can be rewritten in a more invariant form
as

$$\mathcal{M}(e^-_R e^+_L \to B) = \sqrt{\frac{2}{M}}\,(-2e^2)\,(\mathbf{n}^*\cdot\boldsymbol\epsilon_+)\,\psi^*(0). \tag{5.51}$$

The bound state spin polarization $\mathbf{n}$ is projected parallel to $\boldsymbol\epsilon_+$. Note that if
the electrons are initially unpolarized, the cross section for production of $B$
will involve the polarization average

$$\tfrac14\bigl(|\mathbf{n}^*\cdot\boldsymbol\epsilon_+|^2 + |\mathbf{n}^*\cdot\boldsymbol\epsilon_-|^2\bigr) = \tfrac14\bigl(|n^x|^2 + |n^y|^2\bigr). \tag{5.52}$$

Thus, the bound states produced will still be preferentially polarized along
the $e^+e^-$ collision axis.

Assuming an unpolarized electron beam, and summing (5.52) over the
three possible directions of $\mathbf{n}$, we find the following expression for the total
cross section for production of the bound state:

$$\sigma(e^+e^- \to B) = \frac12\,\frac{1}{2m}\,\frac{1}{2m}\int\frac{d^3K}{(2\pi)^3}\frac{1}{2E_{\mathbf{K}}}\,(2\pi)^4\delta^{(4)}(p+p'-K)\cdot\frac{2}{M}\,(4e^4)\,\frac12\,|\psi(0)|^2. \tag{5.53}$$

Notice that the 1-body phase space integral can remove only three of the
four delta functions. It is conventional to rewrite the last delta function using
$\delta(P^0 - K^0) = 2K^0\,\delta(P^2 - K^2)$. Then

$$\sigma(e^+e^- \to B) = 64\pi^3\alpha^2\,\frac{|\psi(0)|^2}{M^3}\,\delta(E_{\rm cm}^2 - M^2). \tag{5.54}$$

The last delta function enforces the constraint that the total center-of-mass
energy must equal the bound-state mass; thus, the bound state is produced
as a resonance in $e^+e^-$ annihilation. If the bound state has a finite lifetime,
this delta function will be broadened into a resonance peak. In practice, the
intrinsic spread of the $e^+e^-$ beam energy is often a more important broadening
mechanism. In either case, (5.54) correctly predicts the area under the
resonance peak.

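If the bound state $B$ can be produced from $e^+e^-$, it can also annihilate
back to $e^+e^-$, or to any other sufficiently light lepton pair. According to (4.86),
the total width for this decay mode is given by

$$\Gamma(B \to e^+e^-) = \frac{1}{2M}\int d\Pi_2\,|\mathcal{M}|^2, \tag{5.55}$$

where $\mathcal{M}$ is just the complex conjugate of the matrix element (5.51) we used
to compute $B$ production. Thus

$$\Gamma = \frac{1}{2M}\int\Bigl(\frac{1}{8\pi}\,\frac{d\cos\theta}{2}\Bigr)\,\frac{8e^4}{M}\,|\psi(0)|^2\,\bigl(|\mathbf{n}\cdot\boldsymbol\epsilon|^2\bigr). \tag{5.56}$$

Now we must sum over electron polarization states and average over the three
possible values of $\mathbf{n}$. The sum over the two electron polarization states
followed by the average over $\mathbf{n}$ gives a factor $2/3$, and we thus obtain

$$\Gamma(B \to e^+e^-) = \frac{16\pi\alpha^2}{3}\,\frac{|\psi(0)|^2}{M^2}. \tag{5.57}$$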


The formula for the decay width of $B$ is very similar to that for the production
cross section, and this is no surprise: Both calculations involve the square of
the same matrix element, summed over initial and final polarizations. The two
calculations differed only in how we formed the polarization averages, and in
the phase-space factors. By this logic, the relation we have found between the
two quantities,

$$\sigma(e^+e^- \to B) = 4\pi^2\cdot\frac{3\,\Gamma(B \to e^+e^-)}{M}\,\delta(E_{\rm cm}^2 - M^2), \tag{5.58}$$

is very general and completely independent of the details of the matrix element
computation. The factor 3 in (5.58) came from the orientation average for $\mathbf{n}$;
for a spin-$J$ bound state, this factor would be $(2J + 1)$.
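
The mutual consistency of (5.54), (5.57), and (5.58) can be confirmed with a few lines of Python. The sketch below is only a bookkeeping check; the function names and the numerical inputs in the usage lines are arbitrary illustrative values, not physical data.

```python
import math

def sigma_coefficient(alpha, psi0_sq, M):
    """Coefficient of delta(E_cm^2 - M^2) in the production cross section (5.54)."""
    return 64 * math.pi**3 * alpha**2 * psi0_sq / M**3

def width_to_ee(alpha, psi0_sq, M):
    """Decay width Gamma(B -> e+ e-) from (5.57)."""
    return 16 * math.pi * alpha**2 * psi0_sq / (3 * M**2)

def sigma_coefficient_from_width(gamma, M, J=1):
    """Coefficient of the delta function in (5.58), with the factor 3 written as 2J+1."""
    return 4 * math.pi**2 * (2 * J + 1) * gamma / M

# Arbitrary test values (any positive numbers would do).
alpha, psi0_sq, M = 1 / 137.036, 2.5e-3, 0.2113
direct = sigma_coefficient(alpha, psi0_sq, M)
via_width = sigma_coefficient_from_width(width_to_ee(alpha, psi0_sq, M), M)
```

The two routes agree identically, since $4\pi^2 \cdot 3 \cdot 16\pi/3 = 64\pi^3$.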

The most famous application of this formalism is to bound states not of
muons but of quarks: quarkonium. We saw the experimental evidence for $q\bar q$
bound states (the $J/\psi$ and $\Upsilon$, for example) in Fig. 5.3. (The resonance peaks
are much too high and too narrow to show in the figure, but their sizes have
been carefully measured.) Equations (5.54) and (5.57) must be multiplied
by a color factor of 3 to give the production cross section and decay width
for a spin-1 $q\bar q$ bound state. The value $\psi(0)$ of the $q\bar q$ wavefunction at the
origin cannot be computed from first principles, but can be estimated from
a nonrelativistic model of the $q\bar q$ spectrum with a phenomenologically chosen
potential. Alternatively, we can use the formula

$$\Gamma(B(q\bar q) \to e^+e^-) = 16\pi\alpha^2 Q^2\,\frac{|\psi(0)|^2}{M^2} \tag{5.59}$$

to measure $\psi(0)$ for a $q\bar q$ bound state. For example, the $1S$ spin-1 state of $s\bar s$,
the $\phi$ meson, has an $e^+e^-$ partial width of 1.4 keV and a mass of 1.02 GeV.
From this we can infer $|\psi(0)|^2 = (1.2\ \text{fm})^{-3}$. This result is physically reasonable,
since hadronic dimensions are typically $\sim$1 fm.
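
The $\phi$-meson estimate can be reproduced by inverting (5.59). The following Python sketch uses the width and mass quoted above; the strange-quark charge $Q = -1/3$, the value of $\hbar c$, and the function names are inputs supplied here for illustration.

```python
import math

ALPHA = 1 / 137.036        # fine-structure constant
HBAR_C_GEV_FM = 0.197327   # hbar * c in GeV fm, used to convert GeV^-1 to fm

def psi0_squared(gamma_ee, mass, charge):
    """Invert (5.59): |psi(0)|^2 = Gamma M^2 / (16 pi alpha^2 Q^2), in GeV^3."""
    return gamma_ee * mass**2 / (16 * math.pi * ALPHA**2 * charge**2)

def length_scale_fm(psi0_sq):
    """Length L defined by |psi(0)|^2 = L^-3, converted to femtometers."""
    return HBAR_C_GEV_FM / psi0_sq ** (1 / 3)

# phi meson: Gamma(e+e-) = 1.4 keV, M = 1.02 GeV, strange-quark charge -1/3.
phi_psi0_sq = psi0_squared(gamma_ee=1.4e-6, mass=1.02, charge=-1 / 3)
phi_length = length_scale_fm(phi_psi0_sq)
```

With these inputs `phi_length` comes out a little under 1.2 fm, consistent with the value quoted in the text to two significant figures.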

Our viewpoint in this section has been quite different from that of earlier
sections: Instead of computing everything from first principles, we have pieced
together an approximate formula using a bit of quantum field theory and a bit
of nonrelativistic quantum mechanics. In principle, however, we could treat
bound states entirely in the relativistic formalism. Consider the annihilation
of an $e^+e^-$ pair to form a $\mu^+\mu^-$ bound state, which subsequently decays back
into $e^+e^-$. In our present formalism we might represent this process by the
diagram

The net process is simply $e^+e^- \to e^+e^-$ (Bhabha scattering). What would
happen if we tried to compute the Bhabha scattering cross section directly in
QED perturbation theory? Obviously there is no $\mu^+\mu^-$ contribution in the
tree-level diagrams:

As we go to higher orders in the perturbation series, however, we find (among
others) the following set of diagrams:


At most values of $E_{\rm cm}$, these diagrams give only a small correction to the
tree-level expression. But when $E_{\rm cm}$ is near the $\mu^+\mu^-$ threshold, the diagrams
involving the exchange of photons within the muon loop contain the
Coulomb interaction between the muons, and therefore become quite large.
One must sum over all such diagrams, and it can be shown that this summation
is equivalent to solving the nonrelativistic Schrödinger equation.* The
final prediction is that the cross section contains a resonance peak, whose area
is given by (5.54) and whose width is given by (5.57).

5.4 Crossing Symmetry 

Electron-Muon Scattering 

Now that we have completed our discussion of the process $e^+e^- \to \mu^+\mu^-$,
let us consider a different but closely related QED process: electron-muon
scattering, or $e^-\mu^- \to e^-\mu^-$. The lowest-order Feynman diagram is just the
previous one turned on its side:


$$= \frac{ie^2}{q^2}\,\bar u(p_1')\gamma^\mu u(p_1)\;\bar u(p_2')\gamma_\mu u(p_2).$$


The relation between the processes $e^+e^- \to \mu^+\mu^-$ and $e^-\mu^- \to e^-\mu^-$
becomes clear when we compute the squared amplitude, averaged and summed
over spins:

$$\frac14\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{e^4}{4q^4}\,\mathrm{tr}\bigl[(\not{p}_1' + m_e)\gamma^\mu(\not{p}_1 + m_e)\gamma^\nu\bigr]\,\mathrm{tr}\bigl[(\not{p}_2' + m_\mu)\gamma_\mu(\not{p}_2 + m_\mu)\gamma_\nu\bigr].$$

This is exactly the same as our result (5.4) for $e^+e^- \to \mu^+\mu^-$, with the
replacements

$$p \to p_1, \qquad p' \to -p_1', \qquad k \to p_2', \qquad k' \to -p_2. \tag{5.60}$$


*This analysis is carried out in Berestetskii, Lifshitz, and Pitaevskii (1982).

So instead of evaluating the traces from scratch, we can just make the same
replacements in our previous result, Eq. (5.10). Setting $m_e = 0$, we find

$$\frac14\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{8e^4}{q^4}\Bigl[(p_1\cdot p_2')(p_1'\cdot p_2) + (p_1\cdot p_2)(p_1'\cdot p_2') - m_\mu^2\,(p_1\cdot p_1')\Bigr]. \tag{5.61}$$

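To evaluate this expression, we must work out the kinematics, which will
be completely different. Working in the center-of-mass frame, we make the
following assignments:

$$p_1 = (k,\ k\hat z), \quad p_2 = (E,\ -k\hat z), \quad p_1' = (k,\ \mathbf{k}), \quad p_2' = (E,\ -\mathbf{k}); \qquad E^2 = k^2 + m_\mu^2, \quad \mathbf{k}\cdot\hat z = k\cos\theta.$$

The combinations we need are

$$
\begin{aligned}
p_1\cdot p_2 &= p_1'\cdot p_2' = k(E + k); & p_1'\cdot p_2 &= p_1\cdot p_2' = k(E + k\cos\theta); \\
p_1\cdot p_1' &= k^2(1 - \cos\theta); & q^2 &= -2p_1\cdot p_1' = -2k^2(1 - \cos\theta).
\end{aligned}
$$

Our expression for the squared matrix element now becomes

$$\frac14\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{2e^4}{k^2(1-\cos\theta)^2}\Bigl((E+k)^2 + (E + k\cos\theta)^2 - m_\mu^2(1-\cos\theta)\Bigr). \tag{5.62}$$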

To find the cross section from this expression, we use Eq. (4.84), which in
the case where one particle is massless takes the simple form

$$\Bigl(\frac{d\sigma}{d\Omega}\Bigr)_{\rm CM} = \frac{|\mathcal{M}|^2}{64\pi^2(E + k)^2}. \tag{5.63}$$

Thus we have our result for unpolarized electron-muon scattering in the
center-of-mass frame:

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2k^2(E+k)^2(1-\cos\theta)^2}\Bigl((E+k)^2 + (E + k\cos\theta)^2 - m_\mu^2(1-\cos\theta)\Bigr), \tag{5.64}$$

where $k = \sqrt{E^2 - m_\mu^2}$. In the high-energy limit where we can set $m_\mu = 0$, the
differential cross section becomes

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2E_{\rm cm}^2(1-\cos\theta)^2}\Bigl(4 + (1+\cos\theta)^2\Bigr). \tag{5.65}$$

Note the singular behavior

$$\frac{d\sigma}{d\Omega} \propto \frac{1}{\theta^4} \quad \text{as } \theta \to 0 \tag{5.66}$$

of formulae (5.64) and (5.65). This singularity is the same as in the Rutherford
formula (Problem 4.4). Such behavior is always present in Coulomb scattering;
it arises from the nearly on-shell (that is, $q^2 \approx 0$) virtual photon.
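
Formulae (5.64) to (5.66) translate directly into code. The Python sketch below (function names and sample numbers are illustrative assumptions) evaluates the center-of-mass cross section, checks that it reduces to (5.65) when $m_\mu = 0$, and exhibits the $1/\theta^4$ growth at small angles.

```python
import math

def dsigma_domega_emu(E, m_mu, theta, alpha=1 / 137.036):
    """Unpolarized e- mu- -> e- mu- in the CM frame, eq. (5.64); E is the muon energy."""
    k = math.sqrt(E**2 - m_mu**2)
    c = math.cos(theta)
    bracket = (E + k) ** 2 + (E + k * c) ** 2 - m_mu**2 * (1 - c)
    return alpha**2 * bracket / (2 * k**2 * (E + k) ** 2 * (1 - c) ** 2)

def dsigma_domega_emu_massless(E_cm, theta, alpha=1 / 137.036):
    """High-energy limit, eq. (5.65)."""
    c = math.cos(theta)
    return alpha**2 * (4 + (1 + c) ** 2) / (2 * E_cm**2 * (1 - c) ** 2)

# With m_mu = 0 we have E = k and E_cm = 2E, so (5.64) must reduce to (5.65).
massless_check = (dsigma_domega_emu(5.0, 0.0, 1.0),
                  dsigma_domega_emu_massless(10.0, 1.0))

# Halving a small angle should raise the cross section by about 2^4 = 16.
small = 1e-3
ratio = dsigma_domega_emu(5.0, 0.10566, small) / dsigma_domega_emu(5.0, 0.10566, 2 * small)
```

The ratio differs from 16 only by corrections of order $\theta^2$, which is the numerical signature of (5.66).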

Crossing Symmetry 

The trick we made use of here, namely the relation between the two processes
$e^+e^- \to \mu^+\mu^-$ and $e^-\mu^- \to e^-\mu^-$, is our first example of a type of relation
known as crossing symmetry. In general, the S-matrix for any process involving
a particle with momentum $p$ in the initial state is equal to the S-matrix for
an otherwise identical process but with an antiparticle of momentum $k = -p$
in the final state. That is,

$$\mathcal{M}\bigl(\phi(p) + \cdots \to \cdots\bigr) = \mathcal{M}\bigl(\cdots \to \cdots + \bar\phi(k)\bigr), \tag{5.67}$$

where $\bar\phi$ is the antiparticle of $\phi$ and $k = -p$. (Note that there is no value of $p$ for
which $p$ and $k$ are both physically allowed, since the particle must have $p^0 > 0$
and the antiparticle must have $k^0 > 0$. So technically, we should say that either
amplitude can be obtained from the other by analytic continuation.)

Relation (5.67) follows directly from the Feynman rules. The diagrams
that contribute to the two amplitudes fall into a natural one-to-one correspondence,
where corresponding diagrams differ only by changing the incoming $\phi$
into the outgoing $\bar\phi$. A typical pair of diagrams looks like this:


In the first diagram, the momenta $q_i$ coming into the vertex from the rest of
the diagram must add up to $-p$, while in the second diagram they must add
up to $k$. Thus the two diagrams are equal, except for any possible difference in
the external leg factors, if $p = -k$. If $\phi$ is a spin-zero boson, there is no external
leg factor, so the identity is proved. If $\phi$ is a fermion, the analysis becomes
more subtle, since the relation depends on the relative phase convention for
the external spinors $u$ and $v$. If we simply replace $p$ by $-k$ in the fermion
polarization sum, we find

$$\sum u(p)\bar u(p) = \not{p} + m = -(\not{k} - m) = -\sum v(k)\bar v(k). \tag{5.68}$$

The minus sign can be compensated by changing our phase convention for
$v(k)$. In practice, it is easiest to cancel by hand one minus sign for each
crossed fermion. With appropriate conventions for the spinors $u(p)$ and $v(k)$,
it is possible to prove the identity (5.67) without spin-averaging.

Mandelstam Variables

It is often useful to express scattering amplitudes in terms of variables that
make it easy to apply crossing relations. For 2-body $\to$ 2-body processes, this
can be done as follows. Label the four external momenta as


We now define three new quantities, the Mandelstam variables:

$$
\begin{aligned}
s &= (p + p')^2 = (k + k')^2; \\
t &= (k - p)^2 = (k' - p')^2; \\
u &= (k' - p)^2 = (k - p')^2.
\end{aligned}
\tag{5.69}
$$

The definitions of $t$ and $u$ appear to be interchangeable (by renaming $k \to k'$);
it is conventional to define $t$ as the squared difference of the initial and final
momenta of the most similar particles. For any process, $s$ is the square of the
total initial 4-momentum. Note that if we had defined all four momenta to be
ingoing, all signs in these definitions would be $+$.

To illustrate the use of the Mandelstam variables, let us first consider
the squared amplitude for $e^+e^- \to \mu^+\mu^-$, working in the massless limit for
simplicity. In this limit we have $t = -2p\cdot k = -2p'\cdot k'$ and $u = -2p\cdot k' =
-2p'\cdot k$, while of course $s = (p + p')^2 = q^2$. Referring to our previous result
(5.10), we find

$$\frac14\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{8e^4}{s^2}\Bigl[\Bigl(\frac{t}{2}\Bigr)^2 + \Bigl(\frac{u}{2}\Bigr)^2\Bigr]. \tag{5.70}$$


To convert to the process $e^-\mu^- \to e^-\mu^-$, we turn the diagram on its side
and make use of the crossing relations, which become quite simple in terms
of Mandelstam variables. For example, the crossing relations tell us to change
the sign of $p'$, the positron momentum, and reinterpret it as the momentum
of the outgoing electron. Therefore $s = (p + p')^2$ becomes what we would
now call $t$, the difference of the outgoing and incoming electron momenta.
Similarly, $t$ becomes $s$, while $u$ remains unchanged. Thus for $e^-\mu^- \to e^-\mu^-$,
we can immediately write down

$$\frac14\sum_{\text{spins}}|\mathcal{M}|^2 = \frac{8e^4}{t^2}\Bigl[\Bigl(\frac{s}{2}\Bigr)^2 + \Bigl(\frac{u}{2}\Bigr)^2\Bigr]. \tag{5.71}$$

You can easily check that this agrees with (5.61) in the massless limit. Note
that while (5.70) and (5.71) look quite similar, they are physically very different:
The denominator of the first is just $s^2 = E_{\rm cm}^4$, but that of the second
involves $t$, which depends on angles and goes to zero as $\theta \to 0$.

When a 2-body $\to$ 2-body diagram contains only one virtual particle, it
is conventional to describe that particle as being in a certain “channel”. The
channel can be read from the form of the Feynman diagram, and each channel
leads to a characteristic angular dependence of the cross section:

    s-channel:   $\mathcal{M} \propto \dfrac{1}{s - m_\phi^2}$

    t-channel:   $\mathcal{M} \propto \dfrac{1}{t - m_\phi^2}$

    u-channel:   $\mathcal{M} \propto \dfrac{1}{u - m_\phi^2}$

In many cases, a single process will receive contributions from more than
one channel; these must be added coherently. For example, the amplitude for
Bhabha scattering, $e^+e^- \to e^+e^-$, is the sum of s- and t-channel diagrams;
Møller scattering, $e^-e^- \to e^-e^-$, involves t- and u-channel diagrams.

To get a better feel for $s$, $t$, and $u$, let us evaluate them explicitly in the
center-of-mass frame for particles all of mass $m$. The kinematics is as usual:

Thus the Mandelstam variables are

$$
\begin{aligned}
s &= (p + p')^2 = (2E)^2 = E_{\rm cm}^2; \\
t &= (k - p)^2 = -p^2\sin^2\theta - p^2(\cos\theta - 1)^2 = -2p^2(1 - \cos\theta); \\
u &= (k' - p)^2 = -p^2\sin^2\theta - p^2(\cos\theta + 1)^2 = -2p^2(1 + \cos\theta).
\end{aligned}
\tag{5.72}
$$

Thus we see that $t \to 0$ as $\theta \to 0$, while $u \to 0$ as $\theta \to \pi$. (When the masses
are not all equal, the limiting values of $t$ and $u$ will shift slightly.)

Note from (5.72) that when all four particles have mass $m$, the sum of
the Mandelstam variables is $s + t + u = 4E^2 - 4p^2 = 4m^2$. This is a special
case of a more general relation, which is often quite useful:

$$s + t + u = \sum_{i=1}^{4} m_i^2, \tag{5.73}$$

where the sum runs over the four external particles. This identity is easy
to prove by adding up the terms on the right-hand side of Eqs. (5.69), and
applying momentum conservation in the form $(p + p' - k - k')^2 = 0$.
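
The Mandelstam variables are straightforward to compute from explicit 4-vectors. The Python sketch below, whose helper names and sample kinematics are illustrative choices, builds the center-of-mass momenta for four particles of equal mass, evaluates (5.69), and applies the $s \leftrightarrow t$ crossing that takes (5.70) into (5.71).

```python
import math

def mdot(a, b):
    """Minkowski product a.b with signature (+,-,-,-)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def msq(a):
    return mdot(a, a)

def mandelstam_cm(E, m, theta):
    """s, t, u of eq. (5.69) for equal-mass 2 -> 2 scattering in the CM frame."""
    p_abs = math.sqrt(E**2 - m**2)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    p = (E, 0.0, 0.0, p_abs)
    p_prime = (E, 0.0, 0.0, -p_abs)
    k = (E, p_abs * sin_t, 0.0, p_abs * cos_t)
    k_prime = (E, -p_abs * sin_t, 0.0, -p_abs * cos_t)
    s = msq(tuple(a + b for a, b in zip(p, p_prime)))
    t = msq(tuple(a - b for a, b in zip(k, p)))
    u = msq(tuple(a - b for a, b in zip(k_prime, p)))
    return s, t, u

def msq_ee_to_mumu(s, t, u, e=1.0):
    """Massless spin-averaged |M|^2 for e+ e- -> mu+ mu-, eq. (5.70)."""
    return 8 * e**4 / s**2 * ((t / 2) ** 2 + (u / 2) ** 2)

def msq_emu_to_emu(s, t, u, e=1.0):
    """Crossed process e- mu- -> e- mu-, eq. (5.71): exchange s and t in (5.70)."""
    return msq_ee_to_mumu(t, s, u, e)

s, t, u = mandelstam_cm(E=3.0, m=1.0, theta=0.8)
```

For these sample values $s = 36$, $t$ and $u$ match the closed forms in (5.72) with $p^2 = 8$, and $s + t + u = 4m^2$ as required by (5.73).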


5.5 Compton Scattering 

We now move on to consider a somewhat different QED process: Compton
scattering, or $e^-\gamma \to e^-\gamma$. We will calculate the unpolarized cross section
for this reaction, to lowest order in $\alpha$. The calculation will employ all the
machinery we have developed so far, including the Mandelstam variables of
the previous section. We will also develop some new technology for dealing
with external photons.

This is our first example of a calculation involving two diagrams: 


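As usual, the Feynman rules tell us exactly how to write down an expression
for $\mathcal{M}$. Note that since the fermion portions of the two diagrams are identical,
there is no relative minus sign between the two terms. Using $\epsilon_\nu(k)$ and $\epsilon^*_\mu(k')$
to denote the polarization vectors of the initial and final photons, we have

$$
\begin{aligned}
i\mathcal{M} &= \bar u(p')(-ie\gamma^\mu)\epsilon^*_\mu(k')\,\frac{i(\not{p} + \not{k} + m)}{(p+k)^2 - m^2}\,(-ie\gamma^\nu)\epsilon_\nu(k)\,u(p) \\
&\quad + \bar u(p')(-ie\gamma^\nu)\epsilon_\nu(k)\,\frac{i(\not{p} - \not{k}' + m)}{(p-k')^2 - m^2}\,(-ie\gamma^\mu)\epsilon^*_\mu(k')\,u(p) \\
&= -ie^2\,\epsilon^*_\mu(k')\epsilon_\nu(k)\;\bar u(p')\biggl[\frac{\gamma^\mu(\not{p} + \not{k} + m)\gamma^\nu}{(p+k)^2 - m^2} + \frac{\gamma^\nu(\not{p} - \not{k}' + m)\gamma^\mu}{(p-k')^2 - m^2}\biggr]u(p).
\end{aligned}
$$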


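We can make a few simplifications before squaring this expression. Since
$p^2 = m^2$ and $k^2 = 0$, the denominators of the propagators are

$$(p + k)^2 - m^2 = 2p\cdot k \qquad \text{and} \qquad (p - k')^2 - m^2 = -2p\cdot k'.$$

To simplify the numerators, we use a bit of Dirac algebra:

$$
\begin{aligned}
(\not{p} + m)\gamma^\nu u(p) &= (2p^\nu - \gamma^\nu\not{p} + \gamma^\nu m)\,u(p) \\
&= 2p^\nu u(p) - \gamma^\nu(\not{p} - m)\,u(p) \\
&= 2p^\nu u(p).
\end{aligned}
$$

Using this trick on the numerator of each propagator, we obtain

$$i\mathcal{M} = -ie^2\,\epsilon^*_\mu(k')\epsilon_\nu(k)\;\bar u(p')\biggl[\frac{\gamma^\mu\not{k}\gamma^\nu + 2\gamma^\mu p^\nu}{2p\cdot k} + \frac{-\gamma^\nu\not{k}'\gamma^\mu + 2\gamma^\nu p^\mu}{-2p\cdot k'}\biggr]u(p). \tag{5.74}$$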


Photon Polarization Sums 

The next step in the calculation will be to square this expression for $\mathcal{M}$
and sum (or average) over electron and photon polarization states. The sum
over electron polarizations can be performed as before, using the identity
$\sum u(p)\bar u(p) = \not{p} + m$. Fortunately, there is a similar trick for summing over
photon polarization vectors. The correct prescription is to make the replacement

$$\sum_{\text{polarizations}}\epsilon^*_\mu\epsilon_\nu \;\longrightarrow\; -g_{\mu\nu}. \tag{5.75}$$

The arrow indicates that this is not an actual equality. Nevertheless, the
replacement is valid as long as both sides are dotted into the rest of the
expression for a QED amplitude $\mathcal{M}$.

To derive this formula, let us consider an arbitrary QED process involving
an external photon with momentum $k$:


$$= i\mathcal{M}(k) \equiv i\mathcal{M}^\mu(k)\,\epsilon^*_\mu(k). \tag{5.76}$$


Since the amplitude always contains $\epsilon^*_\mu(k)$, we have extracted this factor and
defined $\mathcal{M}^\mu(k)$ to be the rest of the amplitude $\mathcal{M}$. The cross section will be
proportional to

$$\sum_\epsilon\bigl|\epsilon^*_\mu(k)\mathcal{M}^\mu(k)\bigr|^2 = \sum_\epsilon\epsilon^*_\mu\epsilon_\nu\,\mathcal{M}^\mu(k)\mathcal{M}^{\nu*}(k).$$

For simplicity, we orient $\mathbf{k}$ in the 3-direction: $k^\mu = (k, 0, 0, k)$. Then the two
transverse polarization vectors, over which we are summing, can be chosen to
be

$$\epsilon_1^\mu = (0, 1, 0, 0); \qquad \epsilon_2^\mu = (0, 0, 1, 0).$$

With these conventions, we have

$$\sum_\epsilon\bigl|\epsilon^*_\mu(k)\mathcal{M}^\mu(k)\bigr|^2 = \bigl|\mathcal{M}^1(k)\bigr|^2 + \bigl|\mathcal{M}^2(k)\bigr|^2. \tag{5.77}$$

Now recall from Chapter 4 that external photons are created by the
interaction term $\int d^4x\; e\,j^\mu A_\mu$, where $j^\mu = \bar\psi\gamma^\mu\psi$ is the Dirac vector current.
Therefore we expect $\mathcal{M}^\mu(k)$ to be given by a matrix element of the Heisenberg
field $j^\mu$:

$$\mathcal{M}^\mu(k) = \int d^4x\; e^{ik\cdot x}\,\langle f|\,j^\mu(x)\,|i\rangle, \tag{5.78}$$

where the initial and final states include all particles except the photon in
question.

From the classical equations of motion, we know that the current $j^\mu$ is
conserved: $\partial_\mu j^\mu(x) = 0$. Provided that this property still holds in the quantum
theory, we can dot $k_\mu$ into (5.78) to obtain

$$k_\mu\mathcal{M}^\mu(k) = 0. \tag{5.79}$$

The amplitude $\mathcal{M}$ vanishes when the polarization vector $\epsilon_\mu(k)$ is replaced
by $k_\mu$. This famous relation is known as the Ward identity. It is essentially
a statement of current conservation, which is a consequence of the gauge
symmetry (4.6) of QED. We will give a formal proof of the Ward identity in
Section 7.4, and discuss a number of subtle points skimmed over in this quick
“derivation”.

It is useful to check explicitly that the Compton amplitude given in (5.74)
obeys the Ward identity. To do this, replace $\epsilon_\nu(k)$ by $k_\nu$ or $\epsilon^*_\mu(k')$ by $k'_\mu$, and
manipulate the Dirac matrix products. In either case (after a bit of algebra)
the terms from the two diagrams cancel each other to give zero.

Returning to our derivation of the polarization sum formula (5.75), we
note that for $k^\mu = (k, 0, 0, k)$, the Ward identity takes the form

$$k\mathcal{M}^0(k) - k\mathcal{M}^3(k) = 0. \tag{5.80}$$

Thus $\mathcal{M}^0 = \mathcal{M}^3$, and we have

$$
\begin{aligned}
\sum_\epsilon\epsilon^*_\mu\epsilon_\nu\,\mathcal{M}^\mu(k)\mathcal{M}^{\nu*}(k) &= |\mathcal{M}^1|^2 + |\mathcal{M}^2|^2 \\
&= |\mathcal{M}^1|^2 + |\mathcal{M}^2|^2 + |\mathcal{M}^3|^2 - |\mathcal{M}^0|^2 \\
&= -g_{\mu\nu}\,\mathcal{M}^\mu(k)\mathcal{M}^{\nu*}(k).
\end{aligned}
$$

That is, we may sum over external photon polarizations by replacing $\sum\epsilon^*_\mu\epsilon_\nu$
with $-g_{\mu\nu}$.

Note that this proves (pending our general proof of the Ward identity)
that the unphysical timelike and longitudinal photons can be consistently
omitted from QED calculations, since in any event the squared amplitudes
for producing these states cancel to give zero total probability. The negative
norm of the timelike photon state, a property that troubled us in the discussion
after Eq. (4.132), plays an essential role in this cancellation.
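
The argument above can be illustrated numerically. In the Python sketch below, the amplitude components are made-up complex numbers (they are not computed from any diagram); the point is only that the transverse sum (5.77) equals $-g_{\mu\nu}\mathcal{M}^\mu\mathcal{M}^{\nu*}$ when, and only when, the Ward identity $\mathcal{M}^0 = \mathcal{M}^3$ holds for $k^\mu = (k, 0, 0, k)$.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g_{mu nu}

def transverse_sum(M):
    """Sum over the two physical polarizations for k along z, eq. (5.77)."""
    return abs(M[1]) ** 2 + abs(M[2]) ** 2

def covariant_sum(M):
    """-g_{mu nu} M^mu M^{nu*}, the result of applying the replacement (5.75)."""
    return -sum(g * abs(m) ** 2 for g, m in zip(METRIC, M))

# Hypothetical components satisfying the Ward identity M^0 = M^3, eq. (5.80).
M_ward = (0.3 - 1.2j, 0.7 + 0.1j, -0.4 + 0.9j, 0.3 - 1.2j)
# The same vector with M^3 changed, so that the Ward identity is violated.
M_bad = (0.3 - 1.2j, 0.7 + 0.1j, -0.4 + 0.9j, 1.0 + 0.0j)

agree = math.isclose(transverse_sum(M_ward), covariant_sum(M_ward))
differ = not math.isclose(transverse_sum(M_bad), covariant_sum(M_bad))
```

For `M_bad` the two sums differ by $|\mathcal{M}^3|^2 - |\mathcal{M}^0|^2$, which is exactly the timelike and longitudinal contribution that the Ward identity forces to cancel.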


The Klein-Nishina Formula 


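The rest of the computation of the Compton scattering cross section is
straightforward, although it helps to be somewhat organized. We want to
average the squared amplitude over the initial electron and photon polarizations,
and sum over the final electron and photon polarizations. Starting with
expression (5.74) for $\mathcal{M}$, we find

$$
\begin{aligned}
\frac14\sum_{\text{spins}}|\mathcal{M}|^2 &= \frac{e^4}{4}\,g_{\mu\rho}g_{\nu\sigma}\cdot\mathrm{tr}\Biggl\{(\not{p}' + m)\biggl[\frac{\gamma^\mu\not{k}\gamma^\nu + 2\gamma^\mu p^\nu}{2p\cdot k} + \frac{\gamma^\nu\not{k}'\gamma^\mu - 2\gamma^\nu p^\mu}{2p\cdot k'}\biggr] \\
&\qquad\qquad\qquad \cdot(\not{p} + m)\biggl[\frac{\gamma^\sigma\not{k}\gamma^\rho + 2\gamma^\rho p^\sigma}{2p\cdot k} + \frac{\gamma^\rho\not{k}'\gamma^\sigma - 2\gamma^\sigma p^\rho}{2p\cdot k'}\biggr]\Biggr\} \\
&\equiv \frac{e^4}{4}\biggl[\frac{\mathbf{I}}{(2p\cdot k)^2} + \frac{\mathbf{II}}{(2p\cdot k)(2p\cdot k')} + \frac{\mathbf{III}}{(2p\cdot k')(2p\cdot k)} + \frac{\mathbf{IV}}{(2p\cdot k')^2}\biggr],
\end{aligned}
\tag{5.81}
$$

where I, II, III, and IV are complicated traces. Note that IV is the same
as I if we replace $k$ with $-k'$. Also, since we can reverse the order of the $\gamma$
matrices inside a trace (Eq. (5.7)), we see that II = III. Thus we must work
only to compute I and II.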

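The first of the traces is

$$\mathbf{I} = \mathrm{tr}\bigl[(\not{p}' + m)(\gamma^\mu\not{k}\gamma^\nu + 2\gamma^\mu p^\nu)(\not{p} + m)(\gamma_\nu\not{k}\gamma_\mu + 2\gamma_\mu p_\nu)\bigr].$$

There are 16 terms inside the trace, but half contain an odd number of $\gamma$
matrices and therefore vanish. We must now evaluate the other eight terms,
one at a time. For example,

$$
\begin{aligned}
\mathrm{tr}\bigl[\not{p}'\gamma^\mu\not{k}\gamma^\nu\not{p}\gamma_\nu\not{k}\gamma_\mu\bigr] &= \mathrm{tr}\bigl[(-2\not{p}')\not{k}(-2\not{p})\not{k}\bigr] \\
&= \mathrm{tr}\bigl[4\not{p}'\not{k}(2p\cdot k - \not{k}\not{p})\bigr] \\
&= 8\,p\cdot k\;\mathrm{tr}\bigl[\not{p}'\not{k}\bigr] \\
&= 32\,(p\cdot k)(p'\cdot k).
\end{aligned}
$$

By similar use of the contraction identities (5.8) and (5.9), and other Dirac
algebra such as $\not{p}\not{p} = p^2 = m^2$, each term in I can be reduced to a trace of no
more than two $\gamma$ matrices. When the smoke clears, we find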

$$\mathbf{I} = 16\bigl(4m^4 - 2m^2\,p\cdot p' + 4m^2\,p\cdot k - 2m^2\,p'\cdot k + 2(p\cdot k)(p'\cdot k)\bigr). \tag{5.82}$$

Although it is not obvious, this expression can be simplified further. To
see how, introduce the Mandelstam variables:

$$
\begin{aligned}
s &= (p + k)^2 = 2p\cdot k + m^2 = 2p'\cdot k' + m^2; \\
t &= (p' - p)^2 = -2p\cdot p' + 2m^2 = -2k\cdot k'; \\
u &= (k' - p)^2 = -2k'\cdot p + m^2 = -2k\cdot p' + m^2.
\end{aligned}
\tag{5.83}
$$

Recall from (5.73) that momentum conservation implies s + t+u = 2to 2 . Writ¬ 
ing everything in terms of s, t, and u, and using this identity, we eventually 
obtain 


I = 16(2m 4 + m 2 (s — m 2 ) — i(s — m 2 )(u — m 2 )). (5.84) 

Sending k -f-> —k', we can immediately write 

IV = 16(2m 4 +m 2 (u — m 2 ) — 4(s — m 2 )(u — to 2 )). (5.85) 

Evaluating the traces in the numerators II and III requires about the same 
amount of work as we have just done. The answer is 


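$$
\mathbf{II} = \mathbf{III} = -8\big(4m^4 + m^2(s-m^2) + m^2(u-m^2)\big). \tag{5.86}
$$

Putting together the pieces of the squared matrix element (5.81), and rewriting
$s$ and $u$ in terms of $p\cdot k$ and $p\cdot k'$, we finally obtain

$$
\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = 2e^4\bigg[\frac{p\cdot k'}{p\cdot k} + \frac{p\cdot k}{p\cdot k'} + 2m^2\Big(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\Big) + m^4\Big(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\Big)^2\bigg]. \tag{5.87}
$$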


To turn this expression into a cross section we must decide on a frame of 
reference and draw a picture of the kinematics. Compton scattering is most 
often analyzed in the “lab” frame, in which the electron is initially at rest: 


We will express the cross section in terms of $\omega$ and $\theta$. We can find
$\omega'$, the energy of the final photon, using the following trick:

$$
\begin{aligned}
m^2 = (p')^2 = (p + k - k')^2 &= p^2 + 2p\cdot(k-k') - 2k\cdot k' \\
&= m^2 + 2m(\omega - \omega') - 2\omega\omega'(1-\cos\theta),
\end{aligned}
$$

hence,

$$
\frac{1}{\omega'} - \frac{1}{\omega} = \frac{1}{m}(1-\cos\theta). \tag{5.88}
$$

The last line is Compton's formula for the shift in the photon wavelength. For
our purposes, however, it is more useful to solve for $\omega'$:

$$
\omega' = \frac{\omega}{1 + \dfrac{\omega}{m}(1-\cos\theta)}. \tag{5.89}
$$


The phase space integral in this frame is 


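$$
\begin{aligned}
\int d\Pi_2 &= \int\frac{d^3k'}{(2\pi)^3}\frac{1}{2\omega'}\,\frac{d^3p'}{(2\pi)^3}\frac{1}{2E'}\,(2\pi)^4\delta^{(4)}(k'+p'-k-p) \\
&= \int\frac{(\omega')^2\,d\omega'\,d\Omega}{(2\pi)^3}\frac{1}{4\omega'E'}
\times 2\pi\,\delta\Big(\omega' + \sqrt{m^2+\omega^2+(\omega')^2-2\omega\omega'\cos\theta} - \omega - m\Big) \\
&= \int\frac{d\cos\theta}{2\pi}\,\frac{\omega'}{4E'}\,\frac{1}{\Big|1 + \dfrac{\omega'-\omega\cos\theta}{E'}\Big|} \\
&= \frac{1}{8\pi}\int d\cos\theta\,\frac{\omega'}{m+\omega(1-\cos\theta)} \\
&= \frac{1}{8\pi}\int d\cos\theta\,\frac{(\omega')^2}{\omega m}.
\end{aligned}
\tag{5.90}
$$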


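Plugging everything into our general cross-section formula (4.79) and setting
$|v_A - v_B| = 1$, we find

$$
\frac{d\sigma}{d\cos\theta} = \frac{1}{2\omega}\,\frac{1}{2m}\cdot\frac{1}{8\pi}\,\frac{(\omega')^2}{\omega m}\cdot\Big(\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2\Big).
$$

To evaluate $|\mathcal{M}|^2$, we replace $p\cdot k = m\omega$ and $p\cdot k' = m\omega'$ in (5.87). The
shortest way to write the final result is

$$
\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{m^2}\Big(\frac{\omega'}{\omega}\Big)^2\bigg[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\bigg], \tag{5.91}
$$

where $\omega'/\omega$ is given by (5.89). This is the (spin-averaged) Klein-Nishina
formula, first derived in 1929.†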

In the limit $\omega \to 0$ we see from (5.89) that $\omega'/\omega \to 1$, so the cross section
becomes

$$
\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{m^2}(1+\cos^2\theta); \qquad \sigma_{\text{total}} = \frac{8\pi\alpha^2}{3m^2}. \tag{5.92}
$$

This is the familiar Thomson cross section for scattering of classical
electromagnetic radiation by a free electron.
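
As a rough numerical illustration of (5.89), (5.91), and the Thomson limit (5.92), the following sketch evaluates the Klein-Nishina formula in natural units. The value of `ALPHA`, the function names, and the midpoint-rule integration are assumptions made for this example only.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant (assumed value)

def omega_ratio(omega, cos_theta, m=1.0):
    """omega'/omega from Eq. (5.89)."""
    return 1.0 / (1.0 + (omega / m) * (1.0 - cos_theta))

def klein_nishina(omega, cos_theta, m=1.0, alpha=ALPHA):
    """d(sigma)/d(cos theta) from Eq. (5.91), natural units."""
    r = omega_ratio(omega, cos_theta, m)
    sin2 = 1.0 - cos_theta**2
    return math.pi * alpha**2 / m**2 * r**2 * (r + 1.0 / r - sin2)

def thomson(cos_theta, m=1.0, alpha=ALPHA):
    """Low-energy limit of the differential cross section, Eq. (5.92)."""
    return math.pi * alpha**2 / m**2 * (1.0 + cos_theta**2)

def total_cross_section(omega, m=1.0, alpha=ALPHA, n=2000):
    """Midpoint-rule integral of (5.91) over cos(theta) in [-1, 1]."""
    h = 2.0 / n
    return h * sum(klein_nishina(omega, -1.0 + (i + 0.5) * h, m, alpha)
                   for i in range(n))

thomson_total = 8.0 * math.pi * ALPHA**2 / 3.0  # Eq. (5.92) with m = 1

# Ratio to the Thomson value at very low and at moderate photon energy.
print(total_cross_section(1e-8) / thomson_total)
print(total_cross_section(1.0) / thomson_total)
```

The first ratio should be very close to 1; the second is smaller, reflecting the suppression of the cross section once $\omega$ becomes comparable to $m$.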


†O. Klein and Y. Nishina, Z. Physik 52, 853 (1929).
High-Energy Behavior

To analyze the high-energy behavior of the Compton scattering cross section, 
it is easiest to work in the center-of-mass frame. We can easily construct the 
differential cross section in this frame from the invariant expression (5.87). 
The kinematics of the reaction now looks like this: 


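Plugging these values into (5.87), we see that for $\theta \approx \pi$, the term $p\cdot k/p\cdot k'$
becomes very large, while the other terms are all of $\mathcal{O}(1)$ or smaller. (In
this frame $p\cdot k = \omega(E+\omega)$ and $p\cdot k' = \omega(E+\omega\cos\theta)$.) Thus for
$E \gg m$ and $\theta \approx \pi$, we have

$$
\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 \approx 2e^4\,\frac{p\cdot k}{p\cdot k'} = 2e^4\,\frac{E+\omega}{E+\omega\cos\theta}. \tag{5.93}
$$

The cross section in the CM frame is given by (4.84):

$$
\begin{aligned}
\frac{d\sigma}{d\cos\theta} &= \frac{1}{2}\cdot\frac{1}{2E}\cdot\frac{1}{2\omega}\cdot\frac{\omega}{(2\pi)4(E+\omega)}\cdot\frac{2e^4(E+\omega)}{E+\omega\cos\theta} \\
&\approx \frac{2\pi\alpha^2}{2m^2 + s(1+\cos\theta)}.
\end{aligned}
\tag{5.94}
$$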


Notice that, since $s \gg m^2$, the denominator of (5.94) almost vanishes
when the photon is emitted in the backward direction ($\theta \approx \pi$). In fact, the
electron mass $m$ could be neglected completely in this formula if it were not
necessary to cut off this singularity. To integrate over $\cos\theta$, we can drop the
electron mass term if we supply an equivalent cutoff near $\theta = \pi$. In this way,
we can approximate the total Compton scattering cross section by

$$
\int_{-1}^{1} d(\cos\theta)\,\frac{d\sigma}{d\cos\theta} \approx \frac{2\pi\alpha^2}{s}\int_{-1+2m^2/s}^{1} d(\cos\theta)\,\frac{1}{(1+\cos\theta)}. \tag{5.95}
$$

Thus, we find that the total cross section behaves at high energy as

$$
\sigma_{\text{total}} = \frac{2\pi\alpha^2}{s}\log\Big(\frac{s}{m^2}\Big). \tag{5.96}
$$

The main dependence $\alpha^2/s$ follows from dimensional analysis. But the
singularity associated with backward scattering of photons leads to an enhancement
by an extra logarithm of the energy.
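
To see how good the leading-logarithm estimate (5.96) is, one can compare it with a direct integration of the approximate differential cross section (5.94). The sketch below assumes natural units, an approximate value for $\alpha$, and hypothetical function names; note that (5.94) is itself only a high-energy approximation, so this checks the integration step, not the full Klein-Nishina result.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant (assumed value)

def dsigma_high_energy(s, cos_theta, m=1.0, alpha=ALPHA):
    """Approximate CM-frame differential cross section, Eq. (5.94)."""
    return 2.0 * math.pi * alpha**2 / (2.0 * m**2 + s * (1.0 + cos_theta))

def sigma_numeric(s, m=1.0, alpha=ALPHA, n=20000):
    """Midpoint-rule integral of (5.94) over cos(theta) in [-1, 1]."""
    h = 2.0 / n
    return h * sum(dsigma_high_energy(s, -1.0 + (i + 0.5) * h, m, alpha)
                   for i in range(n))

def sigma_closed_form(s, m=1.0, alpha=ALPHA):
    """Exact integral of (5.94): (2 pi alpha^2 / s) * log(1 + s/m^2)."""
    return 2.0 * math.pi * alpha**2 / s * math.log(1.0 + s / m**2)

def sigma_leading_log(s, m=1.0, alpha=ALPHA):
    """Leading-logarithm estimate, Eq. (5.96)."""
    return 2.0 * math.pi * alpha**2 / s * math.log(s / m**2)

for s in (1e2, 1e4, 1e8):
    print(s, sigma_closed_form(s) / sigma_leading_log(s))
```

The printed ratio approaches 1 as $s/m^2$ grows, which is the sense in which replacing the mass term by the cutoff in (5.95) is harmless at high energy.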
Let us try to understand the physics of this singularity. The singular term
comes from the square of the $u$-channel diagram,

$$
[\text{$u$-channel diagram}] = -ie^2\,\epsilon_\mu(k)\,\epsilon^*_\nu(k')\,\bar{u}(p')\,\gamma^\mu\,\frac{\slashed{p}-\slashed{k}'+m}{(p-k')^2-m^2}\,\gamma^\nu\,u(p). \tag{5.97}
$$

The amplitude is large at $\theta \approx \pi$ because the denominator of the propagator
is then small ($\sim m^2$) compared to $s$. To be more precise, define $\chi \equiv \pi - \theta$. We
will be interested in values of $\chi$ that are somewhat larger than $m/\omega$, but still
small enough that we can approximate $1 - \cos\chi \approx \chi^2/2$. For $\chi$ in this range,
the denominator is

$$
(p-k')^2 - m^2 = -2p\cdot k' \approx -2\omega^2\Big(\frac{m^2}{2\omega^2} + 1 - \cos\chi\Big) \approx -(\omega^2\chi^2 + m^2). \tag{5.98}
$$

This is small compared to $s$ over a wide range of values for $\chi$, hence the
enhancement in the total cross section.

Looking back at (5.93), we see that for $\chi$ such that $m/\omega \ll \chi \ll 1$, the
squared amplitude is proportional to $1/\chi^2$, and hence we expect $\mathcal{M} \propto 1/\chi$.
But we have just seen that the denominator of $\mathcal{M}$ is proportional to $\chi^2$, so
there must be a compensating factor of $\chi$ in the numerator. We can understand
the physical origin of that factor by looking at the amplitude for a particular
set of electron and photon polarizations.

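Suppose that the initial electron is right-handed. The dominant term of
(5.97) comes from the term that involves $(\slashed{p} - \slashed{k}')$ in the numerator of the
propagator. Since this term contains three $\gamma$ matrices in (5.97) between the
$\bar{u}$ and the $u$, the final electron must also be right-handed. The amplitude is
therefore

$$
i\mathcal{M} = -ie^2\,\epsilon_\mu(k)\,\epsilon^*_\nu(k')\,u_R^\dagger(p')\,\sigma^\mu\,\frac{\bar\sigma\cdot(p-k')}{-(\omega^2\chi^2+m^2)}\,\sigma^\nu\,u_R(p), \tag{5.99}
$$

where

$$
u_R(p) = \sqrt{2E}\begin{pmatrix}0\\1\end{pmatrix} \quad\text{and}\quad u_R(p') = \sqrt{2E}\begin{pmatrix}1\\0\end{pmatrix}. \tag{5.100}
$$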

If the initial photon is left-handed, with $\epsilon^\mu(k) = (1/\sqrt{2})(0,1,-i,0)^\mu$, then

$$
\sigma^\mu\epsilon_\mu(k) = -\frac{1}{\sqrt{2}}(\sigma^1 - i\sigma^2) = -\sqrt{2}\begin{pmatrix}0&0\\1&0\end{pmatrix},
$$

and the combination $u_R^\dagger(p')\,\sigma^\mu\epsilon_\mu(k)$ vanishes. The initial photon must
therefore be right-handed. Similarly, the amplitude vanishes unless the final photon
is right-handed. The kinematic situation for this set of polarizations is shown
in Fig. 5.6. Note that the total spin angular momentum of the final state is
one unit less than that of the initial state.

Figure 5.6. In the high-energy limit, the final photon is most likely to be
emitted at backward angles. Since helicity is conserved, a unit of spin angular
momentum is converted to orbital angular momentum.

Continuing with our calculation, let us consider the numerator of the
propagator in (5.99). For $\chi$ in the range of interest, the dominant term is

$$
-\bar\sigma^1\,(p-k')^1 = \sigma^1\cdot\omega\chi.
$$

This is the factor of $\chi$ anticipated above. It indicates that the final state is
a $p$-wave, as required by angular momentum conservation. Assembling all the
pieces, we obtain

$$
\mathcal{M}(e^-_R\gamma_R \to e^-_R\gamma_R) \approx \frac{4e^2\chi}{\chi^2 + m^2/\omega^2}. \tag{5.101}
$$

We would find the same result in the case where all initial and final particles
are left-handed.

Notice that for directly backward scattering, $\chi = 0$, the matrix element
(5.101) vanishes due to the angular momentum zero in the numerator. Thus,
at angles very close to backward, we should also take into account the mass
term in the numerator of the propagator in (5.97). This term contains only two
gamma matrices and so converts a right-handed electron into a left-handed
electron. By an analysis similar to the one that led to Eq. (5.101), we can
see that this amplitude is nonvanishing only when the initial photon is
left-handed and the final photon is right-handed. Following this analysis in more
detail, we find

$$
\mathcal{M}(e^-_R\gamma_L \to e^-_L\gamma_R) \approx \frac{4e^2\,m/\omega}{\chi^2 + m^2/\omega^2}. \tag{5.102}
$$





The reaction with all four helicities reversed gives the same matrix element. 

To compare this result to our previous calculations, we should add the
contributions to the cross section from (5.101) and (5.102) and equal
contributions for the reactions involving initial left-handed electrons, and divide
by 4 to average over initial spins. The unpolarized differential cross section
should then be

$$
\begin{aligned}
\frac{d\sigma}{d\cos\theta} &= \frac{1}{2}\cdot\frac{1}{2E}\cdot\frac{1}{2\omega}\cdot\frac{\omega}{(2\pi)4(E+\omega)}\bigg[\frac{8e^4\chi^2}{(\chi^2+m^2/\omega^2)^2} + \frac{8e^4\,m^2/\omega^2}{(\chi^2+m^2/\omega^2)^2}\bigg] \\
&= \frac{4\pi\alpha^2}{s(\chi^2 + 4m^2/s)},
\end{aligned}
\tag{5.103}
$$


which agrees precisely with Eq. (5.94). 
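
The bookkeeping in this last step can be checked numerically. The sketch below assembles the cross section from the two helicity amplitudes (5.101) and (5.102), using the high-energy approximations $E \approx \omega$ and $s \approx 4\omega^2$, and compares it with (5.103) and with (5.94) near the backward direction. The function names and sample values are assumptions for illustration only.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant (assumed value)

def helicity_amplitudes(chi, omega, m, e2):
    """Helicity-conserving (5.101) and helicity-flip (5.102) amplitudes."""
    denom = chi**2 + (m / omega)**2
    return 4.0 * e2 * chi / denom, 4.0 * e2 * (m / omega) / denom

def dsigma_from_helicities(chi, s, m=1.0, alpha=ALPHA):
    """Left-hand side of (5.103), with E ~ omega and s ~ 4 omega^2."""
    omega = math.sqrt(s) / 2.0
    energy = omega                 # high-energy approximation E ~ omega
    e2 = 4.0 * math.pi * alpha     # e^2 = 4 pi alpha in the book's units
    m_cons, m_flip = helicity_amplitudes(chi, omega, m, e2)
    # Two electron helicities contribute equally; average over 4 initial states.
    averaged = 2.0 * (m_cons**2 + m_flip**2) / 4.0
    prefactor = 0.5 / (2.0 * energy) / (2.0 * omega) \
        * omega / (2.0 * math.pi * 4.0 * (energy + omega))
    return prefactor * averaged

def dsigma_5103(chi, s, m=1.0, alpha=ALPHA):
    """Right-hand side of (5.103)."""
    return 4.0 * math.pi * alpha**2 / (s * (chi**2 + 4.0 * m**2 / s))

def dsigma_594(chi, s, m=1.0, alpha=ALPHA):
    """Eq. (5.94) evaluated at theta = pi - chi."""
    cos_theta = -math.cos(chi)
    return 2.0 * math.pi * alpha**2 / (2.0 * m**2 + s * (1.0 + cos_theta))

chi, s = 0.01, 1.0e6  # sample point with m/omega << chi << 1 (m = 1)
print(dsigma_from_helicities(chi, s), dsigma_5103(chi, s), dsigma_594(chi, s))
```

The first two numbers agree to rounding error; the third differs only by terms of relative order $\chi^2$, from the expansion of $1-\cos\chi$.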

The importance of the helicity-flip process (5.102) just at the kinematic
endpoint has an interesting experimental consequence. Consider the process
of inverse Compton scattering, a high-energy electron beam colliding with
a low-energy photon beam (for example, a laser beam) to produce a
high-energy photon beam. Let the electrons have energy $E$ and the laser photons
have energy $\varpi$, let the energy of the scattered photon be $E' = yE$, and
assume for simplicity that $s = 4E\varpi \gg m^2$. Then the computation we have
just done applies to this situation, with the highest energy photons resulting
from scattering that is precisely backward in the center-of-mass frame. By
computing $2k\cdot k'$ in the center-of-mass frame and in the lab frame, it is easy
to show that the final photon energy is related to the center-of-mass scattering
angle through

$$
y \approx \tfrac{1}{2}(1-\cos\theta) \approx 1 - \frac{\chi^2}{4}.
$$

Then Eq. (5.103) can be rewritten as a formula for the energy distribution of
backscattered photons near the endpoint:

$$
\frac{d\sigma}{dy} = \frac{2\pi\alpha^2}{s\big((1-y)+m^2/s\big)^2}\bigg[(1-y) + \frac{m^2}{s}\bigg], \tag{5.104}
$$

where the first term in brackets corresponds to the helicity-conserving
process and the second term to the helicity-flip process. Thus, for example, if
a right-handed polarized laser beam is scattered from an unpolarized
high-energy electron beam, most of the backscattered photons will be right-handed
but the highest-energy photons will be left-handed. This effect can be used
experimentally to measure the polarization of an electron beam or to create
high-energy photon sources with adjustable energy distribution and
polarization.
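
The two bracketed terms in (5.104) can be compared directly to see where the helicity-flip contribution takes over. The sketch below is illustrative only: the function name, the value of $\alpha$, and the sample values of $s$ and $y$ are assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant (assumed value)

def endpoint_spectrum(y, s, m=1.0, alpha=ALPHA):
    """Return the (helicity-conserving, helicity-flip) parts of Eq. (5.104)."""
    a = 1.0 - y          # distance from the kinematic endpoint
    b = m**2 / s         # mass cutoff
    prefactor = 2.0 * math.pi * alpha**2 / (s * (a + b)**2)
    return prefactor * a, prefactor * b

s = 1.0e6  # sample value of s in units of m^2
for y in (1.0 - 1.0e-3, 1.0 - 1.0e-6, 1.0):
    conserving, flip = endpoint_spectrum(y, s)
    print(y, flip / (conserving + flip))
```

The printed fraction is the share of helicity-flip photons: it is negligible for $1-y \gg m^2/s$, reaches one half at $1-y = m^2/s$, and becomes 1 exactly at the endpoint.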
Pair Annihilation into Photons

We can still obtain one more result from the Compton-scattering amplitude. 
Consider the annihilation process 

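$$
e^+e^- \to 2\gamma,
$$

given to lowest order by the diagrams


This process is related to Compton scattering by crossing symmetry; we can
obtain the correct amplitude from the Compton amplitude by making the
replacements

$$
p \to p_1 \qquad p' \to -p_2 \qquad k \to -k_1 \qquad k' \to k_2.
$$

Making these substitutions in (5.87), we find

$$
\frac{1}{4}\sum_{\text{spins}}|\mathcal{M}|^2 = -2e^4\bigg[\frac{p_1\cdot k_2}{p_1\cdot k_1} + \frac{p_1\cdot k_1}{p_1\cdot k_2} + 2m^2\Big(\frac{1}{p_1\cdot k_1} + \frac{1}{p_1\cdot k_2}\Big) - m^4\Big(\frac{1}{p_1\cdot k_1} + \frac{1}{p_1\cdot k_2}\Big)^2\bigg]. \tag{5.105}
$$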


The overall minus sign is the result of the crossing relation (5.68) and should 
be removed. 

Now specialize to the center-of-mass frame. The kinematics is 


A routine calculation yields the differential cross section, 

$$
\frac{d\sigma}{d\cos\theta} = \frac{2\pi\alpha^2}{s}\Big(\frac{E}{p}\Big)\bigg[\frac{E^2 + p^2\cos^2\theta}{m^2 + p^2\sin^2\theta} + \frac{2m^2}{m^2 + p^2\sin^2\theta} - \frac{2m^4}{(m^2 + p^2\sin^2\theta)^2}\bigg]. \tag{5.106}
$$

In the high-energy limit, this becomes

$$
\frac{d\sigma}{d\cos\theta} \;\xrightarrow[E\gg m]{}\; \frac{2\pi\alpha^2}{s}\Big(\frac{1+\cos^2\theta}{\sin^2\theta}\Big), \tag{5.107}
$$

except when $\sin\theta$ is of order $m/p$ or smaller. Note that since the two photons
are identical, we count all possible final states by integrating only over
$0 \le \theta \le \pi/2$. Thus the total cross section is computed as

$$
\sigma_{\text{total}} = \int_0^1 d(\cos\theta)\,\frac{d\sigma}{d\cos\theta}. \tag{5.108}
$$

Figure 5.7 compares the asymptotic formula (5.107) for the differential
cross section to measurements of $e^+e^-$ annihilation into two photons at very
high energy.

Figure 5.7. Angular dependence of the cross section for $e^+e^- \to 2\gamma$ at
$E_{\rm cm} = 29$ GeV, as measured by the HRS collaboration, M. Derrick, et al.,
Phys. Rev. D34, 3286 (1986). The solid line is the lowest-order theoretical
prediction, Eq. (5.107).
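
A short numerical sketch can tie together (5.105), (5.106), and (5.107). It assumes center-of-mass momenta $p_1 = (E, 0, 0, p)$ and $k_{1,2} = (E, \pm E\sin\theta, 0, \pm E\cos\theta)$, which is a choice made here for illustration, and all function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine-structure constant (assumed value)

def bracket_from_invariants(E, cos_theta, m=1.0):
    """Square bracket of (5.105), using the assumed CM-frame momenta."""
    p = math.sqrt(E**2 - m**2)
    p1k1 = E * (E - p * cos_theta)
    p1k2 = E * (E + p * cos_theta)
    inv_sum = 1.0 / p1k1 + 1.0 / p1k2
    return p1k2 / p1k1 + p1k1 / p1k2 + 2.0 * m**2 * inv_sum - m**4 * inv_sum**2

def bracket_5106(E, cos_theta, m=1.0):
    """Square bracket of (5.106)."""
    p2 = E**2 - m**2
    d = m**2 + p2 * (1.0 - cos_theta**2)
    return (E**2 + p2 * cos_theta**2) / d + 2.0 * m**2 / d - 2.0 * m**4 / d**2

def dsigma_exact(E, cos_theta, m=1.0, alpha=ALPHA):
    """Differential cross section (5.106) with s = 4 E^2."""
    p = math.sqrt(E**2 - m**2)
    return 2.0 * math.pi * alpha**2 / (4.0 * E**2) * (E / p) * bracket_5106(E, cos_theta, m)

def dsigma_massless(E, cos_theta, alpha=ALPHA):
    """High-energy limit (5.107)."""
    return 2.0 * math.pi * alpha**2 / (4.0 * E**2) \
        * (1.0 + cos_theta**2) / (1.0 - cos_theta**2)

# The invariant bracket is twice the bracket in (5.106).
print(bracket_from_invariants(3.0, 0.4) / bracket_5106(3.0, 0.4))
# Approach to the massless formula as E/m grows.
for E in (2.0, 20.0, 2000.0):
    print(E, dsigma_exact(E, 0.3) / dsigma_massless(E, 0.3))
```

The factor of 2 between the two brackets is absorbed by the flux and phase-space factors that lead to the overall $(2\pi\alpha^2/s)(E/p)$ in (5.106).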


Problems 

5.1 Coulomb scattering. Repeat the computation of Problem 4.4, part (c), this 
time using the full relativistic expression for the matrix element. You should find, for 
the spin-averaged cross section, 

$$
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4|\mathbf{p}|^2\beta^2\sin^4(\theta/2)}\Big(1 - \beta^2\sin^2\frac{\theta}{2}\Big),
$$

where $\mathbf{p}$ is the electron's 3-momentum and $\beta$ is its velocity. This is the Mott formula for
Coulomb scattering of relativistic electrons. Now derive it in a second way, by working
out the cross section for electron-muon scattering, in the muon rest frame, retaining
the electron mass but sending $m_\mu \to \infty$.


5.2 Bhabha scattering. Compute the differential cross section $d\sigma/d\cos\theta$ for
Bhabha scattering, $e^+e^- \to e^+e^-$. You may work in the limit $E_{\rm cm} \gg m_e$, in which
it is permissible to ignore the electron mass. There are two Feynman diagrams; these
must be added in the invariant matrix element before squaring. Be sure that you have
the correct relative sign between these diagrams. The intermediate steps are
complicated, but the final result is quite simple. In particular, you may find it useful to
introduce the Mandelstam variables $s$, $t$, and $u$. Note that, if we ignore the electron
mass, $s + t + u = 0$. You should be able to cast the differential cross section into the
form

$$
\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{s}\bigg[u^2\Big(\frac{1}{s} + \frac{1}{t}\Big)^2 + \Big(\frac{t}{s}\Big)^2 + \Big(\frac{s}{t}\Big)^2\bigg].
$$

Rewrite this formula in terms of $\cos\theta$ and graph it. What feature of the diagrams
causes the differential cross section to diverge as $\theta \to 0$?


5.3 The spinor product formalism introduced in Problem 3.3 provides an efficient
way to compute tree diagrams involving massless particles. Recall that in Problem 3.3
we defined spinor products as follows: Let $u_{L0}$, $u_{R0}$ be the left- and right-handed
spinors at some fixed lightlike momentum $k_0$. These satisfy

$$
u_{L0}\bar{u}_{L0} = \Big(\frac{1-\gamma^5}{2}\Big)\slashed{k}_0, \qquad u_{R0}\bar{u}_{R0} = \Big(\frac{1+\gamma^5}{2}\Big)\slashed{k}_0. \tag{1}
$$

(These relations are just the projections onto definite helicity of the more standard
formula $\sum u_0\bar{u}_0 = \slashed{k}_0$.) Then define spinors for any other lightlike momentum $p$ by

$$
u_L(p) = \frac{1}{\sqrt{2p\cdot k_0}}\,\slashed{p}\,u_{R0}, \qquad u_R(p) = \frac{1}{\sqrt{2p\cdot k_0}}\,\slashed{p}\,u_{L0}. \tag{2}
$$

We showed that these spinors satisfy $\slashed{p}\,u(p) = 0$; because there is no $m$ around, they
can be used as spinors for either fermions or antifermions. We defined

$$
s(p_1,p_2) = \bar{u}_R(p_1)u_L(p_2), \qquad t(p_1,p_2) = \bar{u}_L(p_1)u_R(p_2),
$$

and, in a special frame, we proved the properties

$$
t(p_1,p_2) = (s(p_2,p_1))^*, \qquad s(p_1,p_2) = -s(p_2,p_1), \qquad |s(p_1,p_2)|^2 = 2p_1\cdot p_2. \tag{3}
$$

Now let us apply these results.

(a) To warm up, give another proof of the last relation in Eq. (3) by using (1) to
rewrite $|s(p_1,p_2)|^2$ as a trace of Dirac matrices, and then applying the trace
calculus.

(b) Show that, for any string of Dirac matrices,

$$
\mathrm{tr}[\gamma^\mu\gamma^\nu\gamma^\rho\cdots] = \mathrm{tr}[\cdots\gamma^\rho\gamma^\nu\gamma^\mu],
$$

where $\mu, \nu, \rho, \ldots = 0, 1, 2, 3$, or 5. Use this identity to show that

$$
\bar{u}_L(p_1)\gamma^\mu u_L(p_2) = \bar{u}_R(p_2)\gamma^\mu u_R(p_1).
$$

(c) Prove the Fierz identity

$$
\bar{u}_L(p_1)\gamma^\mu u_L(p_2)\,[\gamma_\mu]_{ab} = 2\big[u_L(p_2)\bar{u}_L(p_1) + u_R(p_1)\bar{u}_R(p_2)\big]_{ab},
$$

where $a, b = 1, 2, 3, 4$ are Dirac indices. This can be done by justifying the
following statements: The right-hand side of this equation is a Dirac matrix;
thus, it can be written as a linear combination of the 16 $\Gamma$ matrices discussed in
Section 3.4. It satisfies

$$
\gamma^5[M] = -[M]\gamma^5,
$$

thus, it must have the form

$$
[M] = \Big(\frac{1-\gamma^5}{2}\Big)\gamma_\mu V^\mu + \Big(\frac{1+\gamma^5}{2}\Big)\gamma_\mu W^\mu,
$$

where $V^\mu$ and $W^\mu$ are 4-vectors. These 4-vectors can be computed by trace
technology; for example,

$$
V^\mu = \frac{1}{2}\mathrm{tr}\Big[\gamma^\mu\Big(\frac{1-\gamma^5}{2}\Big)M\Big].
$$

(d) Consider the process $e^+e^- \to \mu^+\mu^-$, to the leading order in $\alpha$, ignoring the
masses of both the electron and the muon. Consider first the case in which the
electron and the final muon are both right-handed and the positron and the
final antimuon are both left-handed. (Use the spinor $v_R$ for the antimuon and
$u_R$ for the positron.) Apply the Fierz identity to show that the amplitude can
be evaluated directly in terms of spinor products. Square the amplitude and
reproduce the result for

$$
\frac{d\sigma}{d\cos\theta}(e^-_R e^+_L \to \mu^-_R\mu^+_L)
$$

given in Eq. (5.22). Compute the other helicity cross sections for this process
and show that they also reproduce the results found in Section 5.2.

(e) Compute the differential cross section for Bhabha scattering of massless
electrons, helicity state by helicity state, using the spinor product formalism. The
average over initial helicities, summed over final helicities, should reproduce the
result of Problem 5.2. In the process, you should see how this result arises as
the sum of definite-helicity contributions.

5.4 Positronium lifetimes. 

(a) Compute the amplitude $\mathcal{M}$ for $e^+e^-$ annihilation into 2 photons in the extreme
nonrelativistic limit (i.e., keep only the term proportional to zero powers of the
electron and positron 3-momentum). Use this result, together with our
formalism for fermion-antifermion bound states, to compute the rate of annihilation
of the 1S states of positronium into 2 photons. You should find that the spin-1
states of positronium do not annihilate into 2 photons, confirming the
symmetry argument of Problem 3.8. For the spin-0 state of positronium, you should
find a result proportional to the square of the 1S wavefunction at the origin.
Inserting the value of this wavefunction from nonrelativistic quantum mechanics,
you should find

$$
\frac{1}{\tau} = \Gamma = \frac{\alpha^5 m_e}{2} \approx 8.03\times 10^9\ \text{sec}^{-1}.
$$

A recent measurement* gives $\Gamma = 7.994 \pm .011\ \text{nsec}^{-1}$; the 0.5% discrepancy is
accounted for by radiative corrections.

(b) Computing the decay rates of higher-$l$ positronium states is somewhat more
difficult; in the rest of this problem, we will consider the case $l = 1$. First, work
out the terms in the $e^+e^- \to 2\gamma$ amplitude proportional to one power of the
3-momentum. (For simplicity, work in the center-of-mass frame.) Since

$$
\int\frac{d^3p}{(2\pi)^3}\,p^i\,\psi(\mathbf{p}) = i\frac{\partial}{\partial x_i}\psi(\mathbf{x})\bigg|_{\mathbf{x}=0},
$$

this piece of the amplitude has overlap with P-wave bound states. Show that
the $S = 1$, but not the $S = 0$ states, can decay to 2 photons. Again, this is a
consequence of C.

(c) To compute the decay rates of these P-wave states, we need properly normalized
state vectors. Denote the three P-state wavefunctions by

$$
\psi_i = x^i f(|\mathbf{x}|), \qquad\text{normalized to}\quad \int d^3x\,\psi_i^*(\mathbf{x})\psi_j(\mathbf{x}) = \delta_{ij},
$$

and their Fourier transforms by $\psi_i(\mathbf{p})$. Show that

$$
|B(\mathbf{k})\rangle = \sqrt{2M}\int\frac{d^3p}{(2\pi)^3}\,\psi_i(\mathbf{p})\,a^\dagger_{\mathbf{p}+\mathbf{k}/2}\,\Sigma^i\,b^\dagger_{-\mathbf{p}+\mathbf{k}/2}\,|0\rangle
$$

is a properly normalized bound-state vector if $\Sigma^i$ denotes a set of three $2\times 2$
matrices normalized to

$$
\sum_i \mathrm{tr}(\Sigma^{i\dagger}\Sigma^i) = 1.
$$

To build $S = 1$ states, we should take each $\Sigma^i$ to contain a Pauli sigma matrix.
In general, spin-orbit coupling will split the multiplet of $S = 1$, $L = 1$ states
according to the total angular momentum $J$. The states of definite $J$ are given
by

$$
\begin{aligned}
J = 0:&\quad \Sigma^i = \frac{1}{\sqrt{6}}\,\sigma^i, \\
J = 1:&\quad \Sigma^i = \frac{1}{2}\,\epsilon^{ijk}n^j\sigma^k, \\
J = 2:&\quad \Sigma^i = \frac{1}{\sqrt{3}}\,h^{ij}\sigma^j,
\end{aligned}
$$

where $\mathbf{n}$ is a polarization vector satisfying $|\mathbf{n}|^2 = 1$ and $h^{ij}$ is a traceless tensor,
for which a typical value might be $h^{12} = 1$ and all other components zero.

(d) Using the expanded form for the $e^+e^- \to 2\gamma$ amplitude derived in part (b) and
the explicit form of the $S = 1$, $L = 1$, definite-$J$ positronium states found in
part (c), compute, for each $J$, the decay rate of the state into two photons.


*D. W. Gidley et al., Phys. Rev. Lett. 49, 525 (1982).
5.5 Physics of a massive vector boson. Add to QED a massive photon field $B_\mu$
of mass $M$, which couples to electrons via

$$
\Delta H = \int d^3x\,(g\,\bar\psi\gamma^\mu\psi\,B_\mu).
$$

A massive photon in the initial or final state has three possible physical polarizations,
corresponding to the three spacelike unit vectors in the boson's rest frame. These can
be characterized invariantly, in terms of the boson's 4-momentum $k^\mu$, as the three
vectors $\epsilon^{(i)}_\mu$ satisfying

$$
\epsilon^{(i)}\cdot\epsilon^{(j)} = -\delta^{ij}, \qquad k\cdot\epsilon^{(i)} = 0.
$$

The four vectors $(k_\mu/M, \epsilon^{(i)}_\mu)$ form a complete orthonormal basis. Because $B_\mu$ couples
to the conserved current $\bar\psi\gamma^\mu\psi$, the Ward identity implies that $k_\mu$ dotted into the
amplitude for $B$ production gives zero; thus we can replace:

$$
\sum_i \epsilon^{(i)}_\mu\,\epsilon^{(i)*}_\nu \to -g_{\mu\nu}.
$$

This gives a generalization to massive bosons of the Feynman trick for photon
polarization vectors and simplifies the calculation of $B$ production cross sections. (Warning:
This trick does not work (so simply) for "non-Abelian gauge fields".) Let's do a few
of these computations, using always the approximation of ignoring the mass of the
electron.

(a) Compute the cross section for the process $e^+e^- \to B$. Compute the lifetime of
the $B$, assuming that it decays only to electrons. Verify the relation

$$
\sigma(e^+e^- \to B) = \frac{12\pi^2}{M}\,\Gamma(B \to e^+e^-)\,\delta(M^2 - s)
$$

discussed in Section 5.3.

(b) Compute the differential cross section, in the center-of-mass system, for the
process $e^+e^- \to \gamma + B$. (This calculation goes over almost unchanged to the
realistic process $e^+e^- \to \gamma + Z^0$; this allows one to measure the number of
decays of the $Z^0$ into unobserved final states, which is in turn proportional to
the number of neutrino species.)

(c) Notice that the cross section of part (b) diverges as $\theta \to 0$ or $\pi$. Let us analyze
the region near $\theta = 0$. In this region, the dominant contribution comes from
the $t$-channel diagram and corresponds intuitively to the emission of a photon
from the electron line before $e^+e^-$ annihilation into a $B$. Let us rearrange the
formula in such a way as to support this interpretation. First, note that the
divergence as $\theta \to 0$ is cut off by the electron mass: Let the electron momentum
be $p^\mu = (E, 0, 0, k)$, with $k = (E^2 - m_e^2)^{1/2}$, and let the photon momentum be
$k^\mu = (xE, xE\sin\theta, 0, xE\cos\theta)$. Show that the denominator of the propagator
then never becomes smaller than $\mathcal{O}(m_e^2/s)$. Now integrate the cross section of
part (b) over forward angles, cutting off the $\theta$ integral at $\theta^2 \sim (m_e^2/s)$ and
keeping only the leading logarithmic term, proportional to $\log(s/m_e^2)$. Show
that, in this approximation, the cross section for forward photon emission can
be written

$$
\sigma(e^+e^- \to \gamma + B) \approx \int_0^1 dx\,f(x)\cdot\sigma(e^+e^- \to B \text{ at } E^2_{\rm cm} = (1-x)s),
$$

where the annihilation cross section is evaluated for the collision of a positron
of energy $E$ and an electron of energy $(1-x)E$, and the function $f(x)$, the
Weizsäcker-Williams distribution function, is given by

$$
f(x) = \frac{\alpha}{2\pi}\,\frac{1 + (1-x)^2}{x}\cdot\log\Big(\frac{s}{m_e^2}\Big).
$$

This function arises universally in processes in which a photon is emitted
collinearly from an electron line, independent of the subsequent dynamics. We
will meet it again, in another context, in Problem 6.2.


5.6 This problem extends the spinor product technology of Problem 5.3 to external
photons.

(a) Let $k$ be the momentum of a photon, and let $p$ be another lightlike vector, chosen
so that $p\cdot k \neq 0$. Let $u_R(p)$, $u_L(p)$ be spinors of definite helicity for fermions with
the lightlike momentum $p$, defined according to the conventions of Problem 5.3.
Define photon polarization vectors as follows:

$$
\epsilon^\mu_+(k) = \frac{1}{\sqrt{4p\cdot k}}\,\bar{u}_R(k)\gamma^\mu u_R(p), \qquad \epsilon^\mu_-(k) = \frac{1}{\sqrt{4p\cdot k}}\,\bar{u}_L(k)\gamma^\mu u_L(p).
$$

Use the identity

$$
u_L(p)\bar{u}_L(p) + u_R(p)\bar{u}_R(p) = \slashed{p}
$$

to compute the polarization sum

$$
\epsilon^\mu_+\epsilon^{\nu*}_+ + \epsilon^\mu_-\epsilon^{\nu*}_- = -g^{\mu\nu} + \frac{k^\mu p^\nu + p^\mu k^\nu}{p\cdot k}.
$$

The second term on the right gives zero when dotted with any photon emission
amplitude $\mathcal{M}^\mu$, so we have

$$
|\epsilon_+\cdot\mathcal{M}|^2 + |\epsilon_-\cdot\mathcal{M}|^2 = \mathcal{M}^\mu\mathcal{M}^{\nu*}(-g_{\mu\nu});
$$

thus, we can use the vectors $\epsilon_+$, $\epsilon_-$ to compute photon polarization sums.

(b) Using the polarization vectors just defined, and the spinor products and the Fierz
identity from Problem 5.3, compute the differential cross section for a massless
electron and positron to annihilate into 2 photons. Show that the result agrees
with the massless limit derived in (5.107):

$$
\frac{d\sigma}{d\cos\theta} = \frac{2\pi\alpha^2}{s}\Big(\frac{1+\cos^2\theta}{\sin^2\theta}\Big)
$$

in the center-of-mass frame. It follows from the result of part (a) that this answer
is independent of the particular vector $p$ used to define the polarization vectors;
however, the calculation is greatly simplified by taking this vector to be the
initial electron 4-vector.



Chapter 6 


Radiative Corrections: Introduction 


Now that we have acquired some experience at performing QED calculations, 
let us move on to some more complicated problems. Chapter 5 dealt only with 
tree-level processes, that is, with diagrams that contain no loops. But all such 
processes receive higher-order contributions, known as radiative corrections, 
from diagrams that do contain loops. Another source of radiative corrections 
in QED is bremsstrahlung, the emission of extra final-state photons during a 
reaction. In this chapter we will investigate both types of radiative corrections, 
and find that it is inconsistent to include one without also including the other. 

Throughout this chapter, in order to illustrate these ideas in the simplest
possible context, we will consider the process of electron scattering from
another, very heavy, particle. We analyzed this process at tree level in Section 5.4
and Problem 5.1. At the next order in perturbation theory, we encounter the
following four diagrams:


(6.1)


The order-a correction to the cross section comes from the interference term 
between these diagrams and the tree-level diagram. There are six additional 
one-loop diagrams involving the heavy particle in the loop, but they can be 
neglected in the limit where that particle is much heavier than the electron, 
since the mass appears in the denominator of the propagator. (Physically, 
the heavy particle accelerates less, and therefore radiates less, during the 
collision.) 

Of the four diagrams in (6.1), the first (known as the vertex correction ) is 
the most intricate and gives the largest variety of new effects. For example, it 
gives rise to an anomalous magnetic moment for the electron, which we will 
compute in Section 6.3. 

The next two diagrams of (6.1) are external leg corrections. We will neglect
them in this chapter because they are not amputated, as required by our
formula (4.90) for $S$-matrix elements. We will discuss these diagrams in more
detail when we prove that formula in Section 7.2.

The final diagram of (6.1) is called the vacuum polarization. Since it
requires more computational machinery than the others, we will not evaluate
this diagram until Section 7.5.

Our study of these corrections will be complicated by the fact that they
are ill-defined. Each diagram of (6.1) involves an integration over the
undetermined loop momentum, and in each case the integral is divergent in the
$k \to \infty$ or ultraviolet region. Fortunately, the infinite parts of these integrals
will always cancel out of expressions for observable quantities such as cross
sections.

The first three diagrams of (6.1) also contain infrared divergences:
infinities coming from the $k \to 0$ end of the loop-momentum integrals. We will see
in Section 6.4 that these divergences are canceled when we also include the
following bremsstrahlung diagrams:


(6.2)


These diagrams are divergent in the limit where the energy of the radiated 
photon tends to zero. In this limit, the photon cannot be observed by any 
physical detector, so it makes sense to add the cross section for producing these 
low-energy photons to the cross section for scattering without radiation. The 
bremsstrahlung diagrams are thus an essential part of the radiative correction, 
in this and any other QED process. 

Our main goals in the present chapter are to understand bremsstrahlung 
of low-energy photons, the vertex correction diagram, and the cancellation of 
infrared divergences between these two types of radiative corrections. 

6.1 Soft Bremsstrahlung 

Let us begin our study of radiative corrections by analyzing the
bremsstrahlung process. In this section we will first do a classical computation of the
intensity of the low-frequency bremsstrahlung radiation when an electron
undergoes a sudden acceleration. We will then compute a closely related quantity
in quantum field theory: the cross section for emission of one very soft
photon, given by diagrams (6.2). We would like to understand how the classical
result arises as a limiting case of the quantum result.
Classical Computation

Suppose that a classical electron receives a sudden kick at time $t = 0$ and
position $\mathbf{x} = 0$, causing its 4-momentum to change from $p$ to $p'$. (An
infinitely sudden change of momentum is of course an unrealistic idealization.
The precise form of the trajectory during the acceleration does not affect the
low-frequency radiation, however. Our calculation will be valid for radiation
with a frequency less than the reciprocal of the scattering time.)


[Figure: sudden kick at time $t = 0$, when the particle is at $\mathbf{x} = 0$]


We can find the radiation field by writing down the current of this electron, 
and considering that current as a source for Maxwell’s equations. 

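What is the current density of such a particle? For a charged particle at
rest at $\mathbf{x} = 0$, the current would be

$$
\begin{aligned}
j^\mu(x) &= (1, \mathbf{0})^\mu\cdot e\,\delta^{(3)}(\mathbf{x}) \\
&= \int dt\,(1, \mathbf{0})^\mu\cdot e\,\delta^{(4)}(x - y(t)), \qquad\text{with } y^\mu(t) = (t, \mathbf{0})^\mu.
\end{aligned}
$$

From this we can guess the current for an arbitrary trajectory $y^\mu(\tau)$:

$$
j^\mu(x) = e\int d\tau\,\frac{dy^\mu(\tau)}{d\tau}\,\delta^{(4)}(x - y(\tau)). \tag{6.3}
$$

Note that this expression is independent of the precise way in which the
curve $y^\mu(\tau)$ is parametrized: Changing variables from $\tau$ to $\sigma(\tau)$ gives a factor
of $d\tau/d\sigma$ in the integration measure, which combines with $dy^\mu/d\tau$ via the
chain rule to give $dy^\mu/d\sigma$. We can also prove from (6.3) that the current is
automatically conserved: For any "test function" $f(x)$ that falls off at infinity,
we have

$$
\begin{aligned}
\int d^4x\,f(x)\,\partial_\mu j^\mu(x) &= \int d^4x\,f(x)\,e\int d\tau\,\frac{dy^\mu(\tau)}{d\tau}\,\partial_\mu\delta^{(4)}(x - y(\tau)) \\
&= -e\int d\tau\,\frac{dy^\mu(\tau)}{d\tau}\,\frac{\partial}{\partial x^\mu}f(x)\bigg|_{x = y(\tau)} \\
&= -e\int d\tau\,\frac{d}{d\tau}f(y(\tau)) \\
&= -e\,f(y(\tau))\Big|_{-\infty}^{\infty} \\
&= 0.
\end{aligned}
$$

For our process the trajectory is

$$
y^\mu(\tau) = \begin{cases}(p^\mu/m)\,\tau & \text{for } \tau < 0; \\ (p'^\mu/m)\,\tau & \text{for } \tau > 0.\end{cases}
$$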
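Thus the current can be written

$$
j^\mu(x) = e\int_0^\infty d\tau\,\frac{p'^\mu}{m}\,\delta^{(4)}\Big(x - \frac{p'}{m}\tau\Big) + e\int_{-\infty}^0 d\tau\,\frac{p^\mu}{m}\,\delta^{(4)}\Big(x - \frac{p}{m}\tau\Big).
$$

In a moment we will need to know the Fourier transform of this function.
Inserting factors of $e^{-\epsilon\tau}$ and $e^{\epsilon\tau}$ to make the integrals converge, we have

$$
\begin{aligned}
\tilde{j}^\mu(k) &= \int d^4x\,e^{ik\cdot x}\,j^\mu(x) \\
&= e\int_0^\infty d\tau\,\frac{p'^\mu}{m}\,e^{i(k\cdot p'/m + i\epsilon)\tau} + e\int_{-\infty}^0 d\tau\,\frac{p^\mu}{m}\,e^{i(k\cdot p/m - i\epsilon)\tau} \\
&= ie\Big(\frac{p'^\mu}{k\cdot p' + i\epsilon} - \frac{p^\mu}{k\cdot p - i\epsilon}\Big).
\end{aligned}
\tag{6.4}
$$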


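We are now ready to solve Maxwell's equations. In Lorentz gauge ($\partial^\mu A_\mu = 0$)
we must solve $\partial^2 A^\mu = j^\mu$, or in Fourier space,

$$
\tilde{A}^\mu(k) = -\frac{1}{k^2}\,\tilde{j}^\mu(k).
$$

Plugging in (6.4), we obtain a formula for the vector potential:

$$
A^\mu(x) = \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot x}\,\frac{-ie}{k^2}\Big(\frac{p'^\mu}{k\cdot p' + i\epsilon} - \frac{p^\mu}{k\cdot p - i\epsilon}\Big). \tag{6.5}
$$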


The $k^0$ integral can be performed as a contour integral in the complex plane.
The locations of the poles are as follows:


We place the poles at $k^0 = \pm|\mathbf{k}|$ below the real axis so that (as we shall soon
confirm) the radiation field will satisfy retarded boundary conditions.

For $t < 0$ we close the contour upward, picking up the pole at $k\cdot p = 0$,
that is, $k^0 = \mathbf{k}\cdot\mathbf{p}/p^0$. The result is

$$
A^\mu(x) = \int\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot\mathbf{x}}\,e^{-i(\mathbf{k}\cdot\mathbf{p}/p^0)t}\,\frac{(2\pi i)(+ie)}{2\pi k^2}\,\frac{p^\mu}{p^0}.
$$

In the reference frame where the particle is initially at rest, its momentum
vector is $p^\mu = (p^0, \mathbf{0})^\mu$ and the vector potential reduces to

$$
A^\mu(x) = \int\frac{d^3k}{(2\pi)^3}\,e^{i\mathbf{k}\cdot\mathbf{x}}\,\frac{e}{|\mathbf{k}|^2}\cdot(1, \mathbf{0})^\mu.
$$

This is just the Coulomb potential of an unaccelerated charge. As we would
expect, there is no radiation field before the particle is scattered.

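After scattering ($t > 0$), we close the contour downward, picking up the
three poles below the real axis. The pole at $k^0 = \mathbf{k}\cdot\mathbf{p}'/p'^0$ gives the Coulomb
potential of the outgoing particle. Thus the other two poles are completely
responsible for the radiation field. Their contribution gives

$$
\begin{aligned}
A^\mu_{\rm rad}(x) &= \int\frac{d^3k}{(2\pi)^3}\,\frac{-e}{2|\mathbf{k}|}\bigg\{e^{-ik\cdot x}\Big(\frac{p'^\mu}{k\cdot p'} - \frac{p^\mu}{k\cdot p}\Big) + \text{c.c.}\bigg\}\bigg|_{k^0 = |\mathbf{k}|} \\
&= \mathrm{Re}\int\frac{d^3k}{(2\pi)^3}\,\mathcal{A}^\mu(\mathbf{k})\,e^{-ik\cdot x},
\end{aligned}
\tag{6.6}
$$

where the momentum-space amplitude $\mathcal{A}(\mathbf{k})$ is given by

$$
\mathcal{A}^\mu(\mathbf{k}) = \frac{-e}{|\mathbf{k}|}\Big(\frac{p'^\mu}{k\cdot p'} - \frac{p^\mu}{k\cdot p}\Big). \tag{6.7}
$$

(The condition $k^0 = |\mathbf{k}|$ is implicit here and in the rest of this calculation.)

To calculate the energy radiated, we must find the electric and magnetic
fields. It is easiest to write $\mathbf{E}$ and $\mathbf{B}$ as the real parts of complex Fourier
integrals, just as we did for $A^\mu$:

$$
\begin{aligned}
\mathbf{E}(x) &= \mathrm{Re}\int\frac{d^3k}{(2\pi)^3}\,\boldsymbol{\mathcal{E}}(\mathbf{k})\,e^{-ik\cdot x}; \\
\mathbf{B}(x) &= \mathrm{Re}\int\frac{d^3k}{(2\pi)^3}\,\boldsymbol{\mathcal{B}}(\mathbf{k})\,e^{-ik\cdot x}.
\end{aligned}
\tag{6.8}
$$

The momentum-space amplitudes $\boldsymbol{\mathcal{E}}(\mathbf{k})$ and $\boldsymbol{\mathcal{B}}(\mathbf{k})$ of the radiation fields are
then simply

$$
\begin{aligned}
\boldsymbol{\mathcal{E}}(\mathbf{k}) &= -i\mathbf{k}\,\mathcal{A}^0(\mathbf{k}) + ik^0\boldsymbol{\mathcal{A}}(\mathbf{k}); \\
\boldsymbol{\mathcal{B}}(\mathbf{k}) &= i\mathbf{k}\times\boldsymbol{\mathcal{A}}(\mathbf{k}) = \hat{k}\times\boldsymbol{\mathcal{E}}(\mathbf{k}).
\end{aligned}
\tag{6.9}
$$

Using the explicit form (6.7) of $\mathcal{A}^\mu(\mathbf{k})$, you can easily check that the electric
field is transverse: $\mathbf{k}\cdot\boldsymbol{\mathcal{E}}(\mathbf{k}) = 0$.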

Having expressed the fields in this way, we can compute the energy
radiated:

$$
\text{Energy} = \frac{1}{2}\int d^3x\,\big(|\mathbf{E}(x)|^2 + |\mathbf{B}(x)|^2\big). \tag{6.10}
$$
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The first term is 

| Jd 3 xJ (0/(£(k)e-^' + £*(k)e ikx ) ■ (s(k')e- ik '* + £*(k')e ik '*) 

= | J-0 (f (k) • £(-k)e~' 2ik0t + 2£(k)-£*(k) + £*(k) • £*(-k)e 2 ^ 4 ). 

A similar expression involving B(k) holds for the second term. Using (6.9) 
and the fact that £(k) is transverse, you can show that the time-dependent 
terms cancel between £ and B. while the remaining terms add to give 

Energy = f J0^ £(k) ■ £*(k). (6.11) 

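Since $\boldsymbol{\mathcal{E}}(\mathbf{k})$ is transverse, let us introduce two transverse unit polarization
vectors $\boldsymbol{\epsilon}_\lambda(\mathbf{k})$, $\lambda = 1, 2$. We can then write the integrand as

$$\boldsymbol{\mathcal{E}}(\mathbf{k})\cdot\boldsymbol{\mathcal{E}}^*(\mathbf{k}) = \sum_{\lambda=1,2} \big|\boldsymbol{\epsilon}_\lambda(\mathbf{k})\cdot\boldsymbol{\mathcal{E}}(\mathbf{k})\big|^2 = |\mathbf{k}|^2 \sum_{\lambda=1,2} \big|\boldsymbol{\epsilon}_\lambda(\mathbf{k})\cdot\boldsymbol{\mathcal{A}}(\mathbf{k})\big|^2.$$

Using the explicit form (6.7) of $\mathcal{A}^\mu(\mathbf{k})$, we finally arrive at an expression for
the energy radiated*:

$$\text{Energy} = \int\frac{d^3k}{(2\pi)^3} \sum_{\lambda=1,2} \frac{e^2}{2} \left| \boldsymbol{\epsilon}_\lambda\cdot\left(\frac{\mathbf{p}'}{k\cdot p'} - \frac{\mathbf{p}}{k\cdot p}\right)\right|^2. \tag{6.12}$$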


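We can freely change $\boldsymbol{\epsilon}$, $\mathbf{p}'$, and $\mathbf{p}$ into 4-vectors in this expression. Then,
noting that substituting $k^\mu$ for $\epsilon^\mu$ would give zero,

$$k_\mu\left(\frac{p'^\mu}{k\cdot p'} - \frac{p^\mu}{k\cdot p}\right) = 0,$$

we find that we can perform the sum over polarizations using the trick of
Section 5.5, replacing $\sum \epsilon_\mu \epsilon^*_\nu$ by $-g_{\mu\nu}$. Our result then becomes

$$\text{Energy} = \int\frac{d^3k}{(2\pi)^3}\,\frac{e^2}{2}\,(-g_{\mu\nu})\left(\frac{p'^\mu}{k\cdot p'} - \frac{p^\mu}{k\cdot p}\right)\left(\frac{p'^\nu}{k\cdot p'} - \frac{p^\nu}{k\cdot p}\right)$$

$$= \int\frac{d^3k}{(2\pi)^3}\,\frac{e^2}{2}\left(\frac{2\,p\cdot p'}{(k\cdot p')(k\cdot p)} - \frac{m^2}{(k\cdot p')^2} - \frac{m^2}{(k\cdot p)^2}\right). \tag{6.13}$$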
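The step from (6.12) to (6.13) can be spot-checked numerically. The following is only a sketch under stated assumptions: the photon is taken along the +z axis (so the x and y unit vectors serve as the two transverse polarizations), the overall factor $e^2/2$ is dropped from both sides, and the momenta are arbitrary illustrative values. The helper names `mdot`, `on_shell`, `polarization_sum`, and `covariant_sum` are not from the text.

```python
import math

def mdot(a, b):
    """Minkowski product with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def on_shell(m, px, py, pz):
    """Four-momentum (E, px, py, pz) of a particle of mass m."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def polarization_sum(k, p, p_out):
    """Sum over two transverse polarizations, as in (6.12) without e^2/2.

    Assumes the photon momentum k points along +z, so the unit vectors
    along x and y are the two transverse polarization vectors.
    """
    kp, kp_out = mdot(k, p), mdot(k, p_out)
    total = 0.0
    for axis in (1, 2):  # spatial x and y components
        total += (p_out[axis] / kp_out - p[axis] / kp) ** 2
    return total

def covariant_sum(k, p, p_out, m):
    """Last line of (6.13) without e^2/2."""
    kp, kp_out = mdot(k, p), mdot(k, p_out)
    return (2 * mdot(p, p_out) / (kp_out * kp)
            - m**2 / kp_out**2 - m**2 / kp**2)

m = 1.0
k = (0.7, 0.0, 0.0, 0.7)             # massless photon along +z
p = on_shell(m, 0.3, -0.2, 0.5)      # incoming momentum (illustrative)
p_out = on_shell(m, -0.4, 0.1, 0.7)  # outgoing momentum (illustrative)
print(polarization_sum(k, p, p_out), covariant_sum(k, p, p_out, m))
```

The two printed numbers should agree to rounding error; the agreement relies on $k^2 = 0$ and on both electron momenta being on shell.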


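To make this formula more explicit, choose a frame in which $p^0 = p'^0 = E$.
Then the momenta are

$$k^\mu = (k, \mathbf{k}), \qquad p^\mu = E(1, \mathbf{v}), \qquad p'^\mu = E(1, \mathbf{v}').$$

*This result is also derived in Jackson (1975), p. 703.

In such a frame our formula becomes

$$\text{Energy} = \frac{\alpha}{\pi}\int dk\;\mathcal{I}(\mathbf{v},\mathbf{v}'), \tag{6.14}$$

where $\mathcal{I}(\mathbf{v},\mathbf{v}')$ (which is essentially the differential intensity $d(\text{Energy})/dk$) is
given by

$$\mathcal{I}(\mathbf{v},\mathbf{v}') = \int\frac{d\Omega_{\hat{\mathbf{k}}}}{4\pi}\left(\frac{2(1-\mathbf{v}\cdot\mathbf{v}')}{(1-\hat{\mathbf{k}}\cdot\mathbf{v})(1-\hat{\mathbf{k}}\cdot\mathbf{v}')} - \frac{m^2/E^2}{(1-\hat{\mathbf{k}}\cdot\mathbf{v}')^2} - \frac{m^2/E^2}{(1-\hat{\mathbf{k}}\cdot\mathbf{v})^2}\right). \tag{6.15}$$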


Since $\mathcal{I}(\mathbf{v},\mathbf{v}')$ does not depend on $k$, we see that the integral over $k$ in (6.14)
is trivial but divergent. This divergence comes from our idealization of an
infinitely sudden change in momentum. We expect our formula to be valid
only for radiation whose frequency is less than the reciprocal of the scattering
time. For a relativistic electron, another possible cutoff would take effect when
individual photons carry away a sizable fraction of the electron's energy. In
either case our formula is valid in the low-frequency limit, provided that we
cut off the integral at some maximum frequency $k_{\max}$. We then have

$$\text{Energy} = \frac{\alpha}{\pi}\cdot k_{\max}\cdot\mathcal{I}(\mathbf{v},\mathbf{v}'). \tag{6.16}$$


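The integrand of $\mathcal{I}(\mathbf{v},\mathbf{v}')$ peaks when $\hat{\mathbf{k}}$ is parallel to either $\mathbf{v}$ or $\mathbf{v}'$.

In the extreme relativistic limit, most of the radiated energy comes from
the two peaks in the first term of (6.15). Let us evaluate $\mathcal{I}(\mathbf{v},\mathbf{v}')$ in this limit,
by concentrating on the regions around these peaks. Break up the integral
into a piece for each peak, and let $\theta = 0$ along the peak in each case. Integrate
over a small region around $\theta = 0$, as follows:

$$\mathcal{I}(\mathbf{v},\mathbf{v}') \approx \int_{\hat{\mathbf{k}}\cdot\mathbf{v}\,=\,\mathbf{v}'\cdot\mathbf{v}}^{\cos\theta=1} d\cos\theta\,\frac{(1-\mathbf{v}\cdot\mathbf{v}')}{(1-v\cos\theta)(1-\mathbf{v}\cdot\mathbf{v}')} + \int_{\hat{\mathbf{k}}\cdot\mathbf{v}'\,=\,\mathbf{v}'\cdot\mathbf{v}}^{\cos\theta=1} d\cos\theta\,\frac{(1-\mathbf{v}\cdot\mathbf{v}')}{(1-\mathbf{v}\cdot\mathbf{v}')(1-v'\cos\theta)}.$$

(The lower limits on the integrals are not critical; an equally good choice
would be $\hat{\mathbf{k}}\cdot\mathbf{v} = 1 - x(1-\mathbf{v}\cdot\mathbf{v}')$, as long as $x$ is neither too close to 0 nor
too much bigger than 1. It is then easy to show that the leading term in the
relativistic limit does not depend on $x$.) The integrals are easy to perform,
and we obtain

$$\mathcal{I}(\mathbf{v},\mathbf{v}') \approx \log\left(\frac{1-\mathbf{v}\cdot\mathbf{v}'}{1-|\mathbf{v}|}\right) + \log\left(\frac{1-\mathbf{v}\cdot\mathbf{v}'}{1-|\mathbf{v}'|}\right) = \log\left(\frac{(E^2-\mathbf{p}\cdot\mathbf{p}')^2}{E^2(E-|\mathbf{p}|)^2}\right)$$

$$\approx 2\log\left(\frac{p\cdot p'}{(E^2-|\mathbf{p}|^2)/2}\right) = 2\log\left(\frac{-q^2}{m^2}\right), \tag{6.17}$$

where $q^2 = (p'-p)^2$.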
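As a rough numerical illustration of the last step in (6.17), the sketch below compares the two closed forms $2\log[(1-\mathbf{v}\cdot\mathbf{v}')/(1-|\mathbf{v}|)]$ and $2\log(-q^2/m^2)$ in a frame with $|\mathbf{v}| = |\mathbf{v}'|$. It does not evaluate the full angular integral (6.15); it only shows how quickly the two expressions approach each other as $E/m$ grows. The function names and the chosen energies are assumptions made for the example.

```python
import math

def peak_estimate(E, m, cos_angle):
    """2 log[(1 - v.v') / (1 - |v|)] for |v| = |v'|, as in (6.17).

    cos_angle is the cosine of the angle between v and v'.
    """
    v2 = 1.0 - (m / E) ** 2                              # |v|^2
    one_minus_v = (m / E) ** 2 / (1.0 + math.sqrt(v2))   # stable 1 - |v|
    return 2.0 * math.log((1.0 - v2 * cos_angle) / one_minus_v)

def log_estimate(E, m, cos_angle):
    """2 log(-q^2 / m^2) with q = p' - p and equal energies E."""
    p2 = E * E - m * m                                   # |p|^2
    q2 = 2.0 * m * m - 2.0 * (E * E - p2 * cos_angle)    # q^2 < 0
    return 2.0 * math.log(-q2 / (m * m))

for energy in (2.0, 10.0, 1000.0):   # in units of the electron mass
    print(energy, peak_estimate(energy, 1.0, 0.0),
          log_estimate(energy, 1.0, 0.0))
```

For right-angle scattering the two columns differ noticeably at $E = 2m$ and become nearly identical at $E = 1000m$, which is the sense in which (6.17) holds only in the extreme relativistic limit.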

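In conclusion, we have found that the radiated energy at low frequencies
is given by

$$\text{Energy} = \frac{\alpha}{\pi}\int_0^{k_{\max}} dk\;\mathcal{I}(\mathbf{v},\mathbf{v}') \;\underset{E\gg m}{\longrightarrow}\; \frac{2\alpha}{\pi}\int_0^{k_{\max}} dk\,\log\left(\frac{-q^2}{m^2}\right). \tag{6.18}$$

If this energy is made up of photons, each photon contributes energy $k$. We
would then expect

$$\text{Number of photons} = \frac{\alpha}{\pi}\int_0^{k_{\max}} dk\,\frac{1}{k}\,\mathcal{I}(\mathbf{v},\mathbf{v}'). \tag{6.19}$$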

We hope that a quantum-mechanical calculation will confirm this result. 


Quantum Computation 

Consider now the quantum-mechanical process in which one photon is radiated 
during the scattering of an electron: 


Let $\mathcal{M}_0$ denote the part of the amplitude that comes from the electron's
interaction with the external field. Then the amplitude for the whole process
is

$$i\mathcal{M} = -ie\,\bar u(p')\left(\mathcal{M}_0(p',p-k)\,\frac{i(\not{p}-\not{k}+m)}{(p-k)^2-m^2}\,\gamma^\mu\epsilon^*_\mu(k) + \gamma^\mu\epsilon^*_\mu(k)\,\frac{i(\not{p}'+\not{k}+m)}{(p'+k)^2-m^2}\,\mathcal{M}_0(p'+k,p)\right)u(p). \tag{6.20}$$

Since we are interested in connecting with the classical limit, assume that
the photon radiated is soft: $|\mathbf{k}| \ll |\mathbf{p}'-\mathbf{p}|$. Then we can approximate

$$\mathcal{M}_0(p',p-k) \approx \mathcal{M}_0(p'+k,p) \approx \mathcal{M}_0(p',p), \tag{6.21}$$

and we can ignore $\not{k}$ in the numerators of the propagators. The numerators
can be further simplified with some Dirac algebra. In the first term we have

$$(\not{p}+m)\,\gamma^\mu\epsilon^*_\mu\,u(p) = \big[2p^\mu\epsilon^*_\mu + \gamma^\mu\epsilon^*_\mu(-\not{p}+m)\big]u(p) = 2p^\mu\epsilon^*_\mu\,u(p).$$

Similarly, in the second term,

$$\bar u(p')\,\gamma^\mu\epsilon^*_\mu\,(\not{p}'+m) = \bar u(p')\,2p'^\mu\epsilon^*_\mu.$$

The denominators of the propagators also simplify:

$$(p-k)^2 - m^2 = -2p\cdot k; \qquad (p'+k)^2 - m^2 = 2p'\cdot k.$$

So in the soft-photon approximation, the amplitude becomes

$$i\mathcal{M} = \bar u(p')\big[\mathcal{M}_0(p',p)\big]u(p)\cdot\left[e\left(\frac{p'\cdot\epsilon^*}{p'\cdot k} - \frac{p\cdot\epsilon^*}{p\cdot k}\right)\right]. \tag{6.22}$$


This is just the amplitude for elastic scattering (without bremsstrahlung),
times a factor (in brackets) for the emission of the photon.

The cross section for our process is also easy to express in terms of the
elastic cross section; just insert an additional phase-space integration for the
photon variable $k$. Summing over the two photon polarization states, we have

$$d\sigma(p\to p'+\gamma) = d\sigma(p\to p')\cdot\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{2k}\sum_{\lambda=1,2} e^2\left|\frac{p'\cdot\epsilon^{(\lambda)}}{p'\cdot k} - \frac{p\cdot\epsilon^{(\lambda)}}{p\cdot k}\right|^2. \tag{6.23}$$

Thus the differential probability of radiating a photon with momentum $k$,
given that the electron scatters from $p$ to $p'$, is

$$d(\text{prob}) = \frac{d^3k}{(2\pi)^3}\sum_\lambda\frac{e^2}{2k}\left|\boldsymbol{\epsilon}_\lambda\cdot\left(\frac{\mathbf{p}'}{p'\cdot k} - \frac{\mathbf{p}}{p\cdot k}\right)\right|^2. \tag{6.24}$$


This looks very familiar; if we multiply by the photon energy $k$ to compute
the expected energy radiated, we recover the classical expression (6.12).

But there is a problem. Equation (6.24) is an expression not for the
expected number of photons radiated, but for the probability of radiating a
single photon. The problem becomes worse if we integrate over the photon
momentum. As in (6.16), we can integrate only up to the energy at which our
soft-photon approximations break down; a reasonable estimate for this energy
is $|\mathbf{q}| = |\mathbf{p}-\mathbf{p}'|$. The integral is therefore

$$\text{Total probability} \approx \frac{\alpha}{\pi}\int_0^{|\mathbf{q}|} dk\,\frac{1}{k}\,\mathcal{I}(\mathbf{v},\mathbf{v}'). \tag{6.25}$$


Since $\mathcal{I}(\mathbf{v},\mathbf{v}')$ is independent of $k$, the integral diverges at its lower limit
(where all our approximations are well justified). In other words, the total
probability of radiating a very soft photon is infinite. This is the famous
problem of infrared divergences in QED perturbation theory.

We can artificially make the integral in (6.25) well-defined by pretending
that the photon has a very small mass $\mu$. This mass would then provide a
lower cutoff for the integral, allowing us to write the result of this section as

$$d\sigma(p\to p'+\gamma(k)) \approx d\sigma(p\to p')\cdot\frac{\alpha}{2\pi}\log\left(\frac{-q^2}{\mu^2}\right)\mathcal{I}(\mathbf{v},\mathbf{v}')$$

$$\underset{-q^2\to\infty}{\approx}\; d\sigma(p\to p')\cdot\frac{\alpha}{\pi}\log\left(\frac{-q^2}{\mu^2}\right)\log\left(\frac{-q^2}{m^2}\right). \tag{6.26}$$

The $q^2$ dependence of this result, known as the Sudakov double logarithm, is
physical and will appear again in Section 6.4. The dependence on $\mu$, however,
presents a problem that we must solve. It is not hard to guess that the
resolution of this problem will involve reinterpreting (6.24) as the expected number
of radiated photons, rather than the probability of radiating a single
photon. We will see in Sections 6.4 and 6.5 how this reinterpretation follows from
the Feynman diagrams. To prepare for that discussion, however, we need to
improve our understanding of the amplitude for scattering without radiation.
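To get a feeling for why the $\mu$ dependence in (6.26) cannot be the whole story, one can evaluate the double-logarithmic factor for a few photon masses. The sketch below is illustrative only: the numerical values of $\alpha$, the electron mass, and the momentum transfer are assumptions chosen for the example, and `single_photon_factor` is a hypothetical helper name.

```python
import math

ALPHA = 1.0 / 137.036    # fine-structure constant (assumed value)
M_ELECTRON = 0.511       # electron mass in MeV (assumed value)

def single_photon_factor(q2_abs, m, mu):
    """(alpha/pi) log(-q^2/mu^2) log(-q^2/m^2), last line of (6.26).

    q2_abs is -q^2 > 0; m is the electron mass and mu the fictitious
    photon mass. All quantities must be in consistent units.
    """
    return (ALPHA / math.pi) * math.log(q2_abs / mu**2) \
        * math.log(q2_abs / m**2)

q2_abs = (1.0e4) ** 2    # -q^2 for |q| = 10 GeV, in MeV^2 (illustrative)
for mu in (1.0, 1.0e-3, 1.0e-6):
    print(mu, single_photon_factor(q2_abs, M_ELECTRON, mu))
```

With these inputs the factor multiplying the elastic cross section grows without bound as $\mu$ decreases and exceeds 1 for sufficiently small $\mu$, which cannot describe a probability. This is consistent with the reinterpretation suggested in the text.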


6.2 The Electron Vertex Function: Formal Structure 

Having briefly discussed QED radiative corrections due to emission of photons 
(bremsstrahlung), let us now study the correction to electron scattering that 
comes from the presence of an additional virtual photon: 


(6.27) 


This will be our first experience with a Feynman diagram containing a loop. 
Such diagrams give rise to significant and profound complications in quantum 
field theory. 

The result of computing this diagram will be rather complicated, so it 
will be useful to think ahead about what form we expect this correction to 
take and how to interpret its various possible terms. In this section, we will 
consider the general properties of vertex correction diagrams. We will see that 
the basic requirements of Lorentz invariance, the discrete symmetries of QED, 
and the Ward identity strongly constrain the form of the vertex.

Consider, then, the class of diagrams in which a gray circle
indicates the sum of the lowest-order electron-photon
vertex and all amputated loop corrections. We will call this sum of vertex
diagrams $-ie\Gamma^\mu(p',p)$. Then, according to our master formula (4.103) for
S-matrix elements, the amplitude for electron scattering from a heavy target
is

$$i\mathcal{M} = ie^2\Big(\bar u(p')\,\Gamma^\mu(p',p)\,u(p)\Big)\frac{1}{q^2}\Big(\bar u(k')\,\gamma_\mu\,u(k)\Big). \tag{6.28}$$

More generally, the function $\Gamma^\mu(p',p)$ appears in the S-matrix element
for the scattering of an electron from an external electromagnetic field. As in
Problem 4.4, add to the Hamiltonian of QED the interaction

$$\Delta H_{\rm int} = \int d^3x\; e A^{\rm cl}_\mu j^\mu, \tag{6.29}$$

where $j^\mu(x) = \bar\psi(x)\gamma^\mu\psi(x)$ is the electromagnetic current and $A^{\rm cl}_\mu$ is a fixed
classical potential. In the leading order of perturbation theory, the S-matrix
element for scattering from this field is

$$i\mathcal{M}\,(2\pi)\delta(p^{0\prime}-p^0) = -ie\,\bar u(p')\,\gamma^\mu\,u(p)\cdot\tilde A^{\rm cl}_\mu(p'-p),$$

where $\tilde A^{\rm cl}_\mu(q)$ is the Fourier transform of $A^{\rm cl}_\mu(x)$. The vertex corrections modify
this expression to

$$i\mathcal{M}\,(2\pi)\delta(p^{0\prime}-p^0) = -ie\,\bar u(p')\,\Gamma^\mu(p',p)\,u(p)\cdot\tilde A^{\rm cl}_\mu(p'-p). \tag{6.30}$$

In writing (6.28) and (6.30), we have deliberately omitted the contribution of
vacuum polarization diagrams, such as the fourth diagram of (6.1). The reason
for this omission is that these diagrams should be considered corrections to
the electromagnetic field itself, while the diagrams included in $\Gamma^\mu$ represent
corrections to the electron's response to a given applied field.*

We can use general arguments to restrict the form of $\Gamma^\mu(p',p)$. To lowest
order, $\Gamma^\mu = \gamma^\mu$. In general, $\Gamma^\mu$ is some expression that involves $p$, $p'$, $\gamma^\mu$,
and constants such as $m$, $e$, and pure numbers. This list is exhaustive, since
no other objects appear in the Feynman rules for evaluating the diagrams
that contribute to $\Gamma^\mu$. The only other object that could appear in any theory
is $\epsilon^{\mu\nu\rho\sigma}$ (or equivalently, $\gamma^5$); but this is forbidden in any parity-conserving
theory.

*To justify this statement, we must give a careful definition of an applied external
field in a quantum field theory. We will do this in Chapter 11.

We can narrow down the form of $\Gamma^\mu$ considerably by appealing to Lorentz
invariance. Since $\Gamma^\mu$ transforms as a vector (in the same sense that $\gamma^\mu$ does),
it must be a linear combination of the vectors from the list above: $\gamma^\mu$, $p^\mu$, and
$p'^\mu$. Using the combinations $p'+p$ and $p'-p$ for convenience, we have

$$\Gamma^\mu = \gamma^\mu\cdot A + (p'^\mu+p^\mu)\cdot B + (p'^\mu-p^\mu)\cdot C. \tag{6.31}$$


The coefficients $A$, $B$, and $C$ could involve Dirac matrices dotted into vectors,
that is, $\not{p}$ or $\not{p}'$. But since $\not{p}\,u(p) = m\cdot u(p)$ and $\bar u(p')\not{p}' = \bar u(p')\cdot m$, we can
write the coefficients in terms of ordinary numbers without loss of generality.
The only nontrivial scalar available is $q^2 = -2p'\cdot p + 2m^2$, so $A$, $B$, and $C$
must be functions only of $q^2$ (and of constants such as $m$).

The list of allowed vectors can be further shortened by applying the Ward
identity (5.79): $q_\mu\Gamma^\mu = 0$. (Note that our arguments for this identity in
Section 5.5, and the proof in Section 7.4, do not require $q^2 = 0$.) Dotting $q_\mu$
into (6.31), we find that the second term vanishes, as does the first when
sandwiched between $\bar u(p')$ and $u(p)$. The third term does not automatically vanish,
so $C$ must be zero.

We can make no further simplifications of (6.31) on general principles. It
is conventional, however, to rewrite (6.31) by means of the Gordon identity
(see Problem 3.2):

$$\bar u(p')\,\gamma^\mu\,u(p) = \bar u(p')\left[\frac{p'^\mu+p^\mu}{2m} + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\right]u(p). \tag{6.32}$$

This identity allows us to swap the $(p'+p)$ term for one involving $\sigma^{\mu\nu}q_\nu$. We
write our final result as

$$\Gamma^\mu(p',p) = \gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}F_2(q^2), \tag{6.33}$$

where $F_1$ and $F_2$ are unknown functions of $q^2$ called form factors.

To lowest order, $F_1 = 1$ and $F_2 = 0$. In the next section we will compute
the one-loop (order-$\alpha$) corrections to the form factors, due to the vertex
correction diagram (6.27). In principle, the form factors can be computed to any
order in perturbation theory.

Since $F_1$ and $F_2$ contain complete information about the influence of
an electromagnetic field on the electron, they should, in particular, contain
the electron's gross electric and magnetic couplings. To identify the electric
charge of the electron, we can use (6.30) to compute the amplitude for elastic
Coulomb scattering of a nonrelativistic electron from a region of nonzero
electrostatic potential. Set $A^{\rm cl}_\mu(x) = (\phi(\mathbf{x}), \mathbf{0})$. Then $\tilde A^{\rm cl}_\mu(q) = ((2\pi)\delta(q^0)\tilde\phi(\mathbf{q}), \mathbf{0})$.
Inserting this into (6.30), we find

$$i\mathcal{M} = -ie\,\bar u(p')\,\Gamma^0(p',p)\,u(p)\cdot\tilde\phi(\mathbf{q}).$$

If the electrostatic field is very slowly varying over a large (perhaps
macroscopic) region, $\tilde\phi(\mathbf{q})$ will be concentrated about $\mathbf{q} = 0$; then we can take the
limit $q \to 0$ in the spinor matrix element. Only the form factor $F_1$ contributes.
Using the nonrelativistic limit of the spinors,

$$\bar u(p')\,\gamma^0\,u(p) = u^\dagger(p')\,u(p) \approx 2m\,\xi'^\dagger\xi,$$

the amplitude for electron scattering from an electric field takes the form

$$i\mathcal{M} = -ie\,F_1(0)\,\tilde\phi(\mathbf{q})\cdot 2m\,\xi'^\dagger\xi. \tag{6.34}$$

This is the Born approximation for scattering from a potential

$$V(\mathbf{x}) = e\,F_1(0)\,\phi(\mathbf{x}).$$

Thus $F_1(0)$ is the electric charge of the electron, in units of $e$. Since $F_1(0) = 1$
already in the leading order of perturbation theory, radiative corrections to
$F_1(q^2)$ should vanish at $q^2 = 0$.

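By repeating this analysis for an electron scattering from a static vector
potential, we can derive a similar connection between the form factors and the
electron's magnetic moment.* Set $A^{\rm cl}_\mu(x) = (0, \mathbf{A}_{\rm cl}(\mathbf{x}))$. Then the amplitude
for scattering from this field is

$$i\mathcal{M} = +ie\left[\bar u(p')\left(\gamma^i F_1 + \frac{i\sigma^{i\nu}q_\nu}{2m}F_2\right)u(p)\right]\tilde A^i_{\rm cl}(\mathbf{q}). \tag{6.35}$$

The expression in brackets vanishes at $\mathbf{q} = 0$, so we must carefully extract from
it a contribution linear in $q^i$. To do this, insert the nonrelativistic expansion
of the spinors $u(p)$, keeping terms through first order in momenta:

$$u(p) = \begin{pmatrix}\sqrt{p\cdot\sigma}\,\xi\\ \sqrt{p\cdot\bar\sigma}\,\xi\end{pmatrix} \approx \sqrt{m}\begin{pmatrix}(1-\mathbf{p}\cdot\boldsymbol{\sigma}/2m)\,\xi\\ (1+\mathbf{p}\cdot\boldsymbol{\sigma}/2m)\,\xi\end{pmatrix}. \tag{6.36}$$

Then the $F_1$ term can be simplified as follows:

$$\bar u(p')\,\gamma^i\,u(p) = 2m\,\xi'^\dagger\left(\frac{\mathbf{p}'\cdot\boldsymbol{\sigma}}{2m}\,\sigma^i + \sigma^i\,\frac{\mathbf{p}\cdot\boldsymbol{\sigma}}{2m}\right)\xi.$$

Applying the identity $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, we find a spin-independent term,
proportional to $(\mathbf{p}'+\mathbf{p})$, and a spin-dependent term, proportional to $(\mathbf{p}'-\mathbf{p})$.
The first of these terms is the contribution of the operator $[\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p}]$ in
the standard kinetic energy term of nonrelativistic quantum mechanics. The
second is the magnetic moment interaction we are seeking. Retaining only the
latter term, we have

$$\bar u(p')\,\gamma^i\,u(p) = 2m\,\xi'^\dagger\left(\frac{-i}{2m}\,\epsilon^{ijk}q^j\sigma^k\right)\xi.$$

The $F_2$ term already contains an explicit factor of $q$, so we can evaluate it
using the leading-order term of the expansion of the spinors. This gives

$$\bar u(p')\left(\frac{i}{2m}\,\sigma^{i\nu}q_\nu\right)u(p) = 2m\,\xi'^\dagger\left(\frac{-i}{2m}\,\epsilon^{ijk}q^j\sigma^k\right)\xi.$$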

*The following argument contains numerous factors of $(-1)$ from raising and
lowering spacelike indices. Be careful in verifying the algebra.

Thus, the complete term linear in $q^j$ in the electron-photon vertex function is

$$\bar u(p')\left(\gamma^i F_1 + \frac{i\sigma^{i\nu}q_\nu}{2m}F_2\right)u(p) \;\underset{q\to 0}{\longrightarrow}\; 2m\,\xi'^\dagger\left(\frac{-i}{2m}\,\epsilon^{ijk}q^j\sigma^k\,[F_1(0)+F_2(0)]\right)\xi.$$

Inserting this expression into (6.35), we find

$$i\mathcal{M} = -i(2m)\cdot e\,\xi'^\dagger\left(\frac{-1}{2m}\,\sigma^k\,[F_1(0)+F_2(0)]\right)\xi\;\tilde B^k(\mathbf{q}),$$

where $\tilde B^k(\mathbf{q})$ is the Fourier transform of the magnetic field produced by $\mathbf{A}_{\rm cl}(\mathbf{x})$.

Again we can interpret $\mathcal{M}$ as the Born approximation to the scattering
of the electron from a potential well. The potential is just that of a magnetic
moment interaction,

$$V(\mathbf{x}) = -\langle\boldsymbol{\mu}\rangle\cdot\mathbf{B}(\mathbf{x}),$$

where

$$\langle\boldsymbol{\mu}\rangle = \frac{e}{m}\,[F_1(0)+F_2(0)]\;\xi'^\dagger\frac{\boldsymbol{\sigma}}{2}\xi.$$

This expression for the magnetic moment of the electron can be rewritten in
the standard form

$$\boldsymbol{\mu} = g\left(\frac{e}{2m}\right)\mathbf{S},$$

where $\mathbf{S}$ is the electron spin. The coefficient $g$, called the Landé g-factor, is

$$g = 2\,[F_1(0)+F_2(0)] = 2 + 2F_2(0). \tag{6.37}$$

Since the leading order of perturbation theory gives no $F_2$ term, QED predicts
$g = 2 + O(\alpha)$. The leading term is the standard prediction of the Dirac
equation. In higher orders, however, we will find a nonzero $F_2$ and thus a small
difference between the electron's magnetic moment and the Dirac value. We
will compute the order-$\alpha$ contribution to this anomalous magnetic moment
in the next section.

Since our derivation of the structure (6.33) for the vertex function used
only general symmetry principles, we expect this formula to apply not only
to the electron but to any fermion with electromagnetic interactions. For
example, the electromagnetic scattering amplitude of the proton should also be
described by two invariant functions of $q^2$. Since the proton is not an
elementary particle, we should not expect the Dirac equation values $F_1 = 1$ and
$F_2 = 0$ to be good approximations to the form factors of the proton. In fact,
both proton form factors depend strongly on $q^2$. However, the description of
the vertex function in terms of form factors provides a useful summary of data
on scattering at many energies and angles. The precise transcription between
form factors and cross sections is worked out in Problem 6.1. In addition, the
general constraints at $q^2 = 0$ that we have just derived apply to the proton:
$F_1(0) = 1$, and $2F_2(0) = (g_p - 2)$, though the g-factor of the proton differs by
40% from the Dirac value.


6.3 The Electron Vertex Function: Evaluation

Now that we know what form the answer is to take (Eq. (6.33)), we are ready
to evaluate the one-loop contribution to the electron vertex function. Assign
momenta on the diagram so that the incoming and outgoing electrons carry $p$
and $p' = p + q$, the internal electron lines carry $k$ and $k' = k + q$, and the
virtual photon carries $p - k$.

Applying the Feynman rules, we find, to order $\alpha$, that $\Gamma^\mu = \gamma^\mu + \delta\Gamma^\mu$, where

$$\bar u(p')\,\delta\Gamma^\mu(p',p)\,u(p) = \int\frac{d^4k}{(2\pi)^4}\,\frac{-ig_{\nu\rho}}{(k-p)^2+i\epsilon}\;\bar u(p')\,(-ie\gamma^\nu)\,\frac{i(\not{k}'+m)}{k'^2-m^2+i\epsilon}\,\gamma^\mu\,\frac{i(\not{k}+m)}{k^2-m^2+i\epsilon}\,(-ie\gamma^\rho)\,u(p)$$

$$= 2ie^2\int\frac{d^4k}{(2\pi)^4}\,\frac{\bar u(p')\big[\not{k}\gamma^\mu\not{k}' + m^2\gamma^\mu - 2m(k+k')^\mu\big]u(p)}{((k-p)^2+i\epsilon)(k'^2-m^2+i\epsilon)(k^2-m^2+i\epsilon)}. \tag{6.38}$$

In the second line we have used the contraction identity $\gamma^\nu\gamma^\mu\gamma_\nu = -2\gamma^\mu$.
Note that the $+i\epsilon$ terms in the denominators cannot be dropped; they are
necessary for proper evaluation of the loop-momentum integral.

The integral looks impossible, and in fact it will not be easy. The
evaluation of such integrals requires another piece of computational technology,
known as the method of Feynman parameters (although a very similar method
was introduced earlier by Schwinger).


Feynman Parameters 

The goal of this method is to squeeze the three denominator factors of (6.38) 
into a single quadratic polynomial in k , raised to the third power. We can then 
shift k by a constant to complete the square in this polynomial and evaluate 
the remaining spherically symmetric integral without difficulty. The price will 
be the introduction of auxiliary parameters to be integrated over. 

It is easiest to begin with the simpler case of two factors in the
denominator. We would then use the identity

$$\frac{1}{AB} = \int_0^1 dx\,\frac{1}{[xA+(1-x)B]^2} = \int_0^1 dx\,dy\;\delta(x+y-1)\,\frac{1}{[xA+yB]^2}. \tag{6.39}$$

An example of its use might look like this:

$$\frac{1}{(k-p)^2(k^2-m^2)} = \int_0^1 dx\,dy\;\delta(x+y-1)\,\frac{1}{[x(k-p)^2+y(k^2-m^2)]^2}$$

$$= \int_0^1 dx\,dy\;\delta(x+y-1)\,\frac{1}{[k^2-2xk\cdot p+xp^2-ym^2]^2}.$$

If we now let $\ell \equiv k - xp$, we see that the denominator depends only on $\ell^2$.
Integrating over $d^4k$ would now be much easier, since $d^4k = d^4\ell$ and the
integrand is spherically symmetric with respect to $\ell$. The variables $x$ and $y$
that make this transformation possible are called Feynman parameters.
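The two-factor identity (6.39) is easy to test numerically for ordinary real numbers. The sketch below assumes that $A$ and $B$ have the same sign, so that the combined denominator never vanishes on the integration range; the function names and the composite Simpson rule are choices made for this illustration.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def feynman_two(A, B):
    """Integral over x in [0, 1] of 1 / [x A + (1 - x) B]^2, as in (6.39)."""
    return simpson(lambda x: 1.0 / (x * A + (1.0 - x) * B) ** 2, 0.0, 1.0)

for A, B in ((2.0, 5.0), (-3.0, -1.0)):
    print(A, B, feynman_two(A, B), 1.0 / (A * B))
```

In each row the numerical integral should agree with $1/(AB)$ to high accuracy.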

Our integral (6.38) involves a denominator with three factors, so we need
a slightly better identity. By differentiating (6.39) with respect to $B$, it is easy
to prove

$$\frac{1}{AB^n} = \int_0^1 dx\,dy\;\delta(x+y-1)\,\frac{n\,y^{n-1}}{[xA+yB]^{n+1}}. \tag{6.40}$$

But this still isn't quite good enough. The formula we need is

$$\frac{1}{A_1A_2\cdots A_n} = \int_0^1 dx_1\cdots dx_n\;\delta\Big(\sum x_i - 1\Big)\,\frac{(n-1)!}{[x_1A_1+x_2A_2+\cdots+x_nA_n]^n}. \tag{6.41}$$

The proof of this identity is by induction. The case $n = 2$ is just Eq. (6.39);
the induction step is not difficult and involves the use of (6.40).
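The $n = 3$ case of (6.41), which is the one needed for the vertex integral, can be checked the same way. This sketch eliminates $x_3$ with the delta function and integrates over the remaining triangle with nested Simpson rules; it assumes positive $A_i$ so the denominator stays away from zero. The names `simpson` and `feynman_three` are illustrative.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    if b == a:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def feynman_three(A1, A2, A3):
    """Right-hand side of (6.41) for n = 3, with x3 = 1 - x1 - x2."""
    def inner(x1):
        return simpson(
            lambda x2: 2.0 / (x1 * A1 + x2 * A2
                              + (1.0 - x1 - x2) * A3) ** 3,
            0.0, 1.0 - x1)
    return simpson(inner, 0.0, 1.0)

print(feynman_three(1.0, 2.0, 3.0), 1.0 / (1.0 * 2.0 * 3.0))
```

The factor 2 in the integrand is $(n-1)!$ for $n = 3$.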

By repeated differentiation of (6.41), you can derive the even more general
identity

$$\frac{1}{A_1^{m_1}A_2^{m_2}\cdots A_n^{m_n}} = \int_0^1 dx_1\cdots dx_n\;\delta\Big(\sum x_i - 1\Big)\,\frac{\prod x_i^{m_i-1}}{\big[\sum x_iA_i\big]^{\sum m_i}}\;\frac{\Gamma(m_1+\cdots+m_n)}{\Gamma(m_1)\cdots\Gamma(m_n)}. \tag{6.42}$$

This formula is true even when the $m_i$ are not integers; in Section 10.5 we
will apply it in such a case.


Evaluation of the Form Factors 

Now let us apply formula (6.41) to the denominator of (6.38):

$$\frac{1}{((k-p)^2+i\epsilon)(k'^2-m^2+i\epsilon)(k^2-m^2+i\epsilon)} = \int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\,\frac{2}{D^3},$$

where the new denominator $D$ is

$$D = x(k^2-m^2) + y(k'^2-m^2) + z(k-p)^2 + (x+y+z)\,i\epsilon$$

$$= k^2 + 2k\cdot(yq-zp) + yq^2 + zp^2 - (x+y)m^2 + i\epsilon. \tag{6.43}$$

In the second line we have used $x+y+z = 1$ and $k' = k+q$. Now shift $k$ to
complete the square:

$$\ell \equiv k + yq - zp.$$

After a bit of algebra we find that $D$ simplifies to

$$D = \ell^2 - \Delta + i\epsilon,$$

where

$$\Delta \equiv -xyq^2 + (1-z)^2m^2. \tag{6.44}$$

Since $q^2 < 0$ for a scattering process, $\Delta$ is positive; we can think of it as an
effective mass term.
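The "bit of algebra" leading from (6.43) to (6.44) uses $p^2 = p'^2 = m^2$, hence $p\cdot q = -q^2/2$, together with $x+y+z = 1$. The following sketch checks the result for one arbitrary choice of momenta and Feynman parameters. The $i\epsilon$ terms are omitted, and all numerical values and helper names are assumptions made for the example.

```python
import math

def mdot(a, b):
    """Minkowski product with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def add(a, b, s=1.0):
    """Return the four-vector a + s * b."""
    return tuple(ai + s * bi for ai, bi in zip(a, b))

def on_shell(m, px, py, pz):
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def d_original(k, p, q, m, x, y, z):
    """First line of (6.43), without the i*epsilon term."""
    k_prime = add(k, q)        # k' = k + q
    k_minus_p = add(k, p, -1.0)
    return (x * (mdot(k, k) - m * m)
            + y * (mdot(k_prime, k_prime) - m * m)
            + z * mdot(k_minus_p, k_minus_p))

def d_shifted(k, p, q, m, x, y, z):
    """l^2 - Delta with l = k + y q - z p and Delta from (6.44)."""
    l = add(add(k, q, y), p, -z)
    delta = -x * y * mdot(q, q) + (1.0 - z) ** 2 * m * m
    return mdot(l, l) - delta

m = 1.0
p = on_shell(m, 0.0, 0.0, 2.0)       # incoming electron (illustrative)
p_out = on_shell(m, 1.2, 0.0, 1.6)   # outgoing electron, same |p|
q = add(p_out, p, -1.0)              # q = p' - p, spacelike here
k = (0.3, -0.7, 1.1, 0.4)            # arbitrary off-shell loop momentum
x, y, z = 0.2, 0.5, 0.3              # Feynman parameters summing to 1
print(d_original(k, p, q, m, x, y, z), d_shifted(k, p, q, m, x, y, z))
```

Both expressions should print the same number, and $\Delta$ is positive because $q^2 < 0$ for these momenta.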

Next we must express the numerator of (6.38) in terms of $\ell$. This task is
simplified by noting that since $D$ depends only on the magnitude of $\ell$,

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\ell^\mu}{D^3} = 0; \tag{6.45}$$

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\ell^\mu\ell^\nu}{D^3} = \int\frac{d^4\ell}{(2\pi)^4}\,\frac{\frac{1}{4}g^{\mu\nu}\ell^2}{D^3}. \tag{6.46}$$

The first identity follows from symmetry. To prove the second, note that the
integral vanishes by symmetry unless $\mu = \nu$. Lorentz invariance therefore
requires that we get something proportional to $g^{\mu\nu}$. To check the coefficient,
contract each side with $g_{\mu\nu}$. Using these identities, we have

$$\text{Numerator} = \bar u(p')\Big[\not{k}\gamma^\mu\not{k}' + m^2\gamma^\mu - 2m(k+k')^\mu\Big]u(p)$$

$$\to \bar u(p')\Big[-\tfrac{1}{2}\gamma^\mu\ell^2 + \big(-y\not{q}+z\not{p}\big)\gamma^\mu\big((1-y)\not{q}+z\not{p}\big) + m^2\gamma^\mu - 2m\big((1-2y)q^\mu+2zp^\mu\big)\Big]u(p).$$

(Remember that $k' = k+q$.)

Putting the numerator into a useful form is now just a matter of some
tedious Dirac algebra (about a page or two). This is where our work in the
last section pays off, since it tells us what kind of an answer to expect. We
eventually want to group everything into two terms, proportional to $\gamma^\mu$ and
$i\sigma^{\mu\nu}q_\nu$. The most straightforward way to accomplish this is to aim instead for
an expression of the form

$$\gamma^\mu\cdot A + (p'^\mu+p^\mu)\cdot B + q^\mu\cdot C,$$

just as in (6.31). Attaining this form requires only the anticommutation
relations (for example, $\not{p}\gamma^\mu = 2p^\mu - \gamma^\mu\not{p}$) and the Dirac equation ($\not{p}\,u(p) = m\,u(p)$
and $\bar u(p')\not{p}' = \bar u(p')\,m$; note that this implies $\bar u(p')\not{q}\,u(p) = 0$). It is also useful
to remember that $x+y+z = 1$. When the smoke clears, we have

$$\text{Numerator} = \bar u(p')\Big[\gamma^\mu\cdot\big(-\tfrac{1}{2}\ell^2 + (1-x)(1-y)q^2 + (1-2z-z^2)m^2\big) + (p'^\mu+p^\mu)\cdot mz(z-1) + q^\mu\cdot m(z-2)(x-y)\Big]u(p).$$

The coefficient of $q^\mu$ must vanish according to the Ward identity, as discussed
after Eq. (6.31). To see that it does, note from (6.44) that the denominator
is symmetric under $x \leftrightarrow y$. The coefficient of $q^\mu$ is odd under $x \leftrightarrow y$ and
therefore vanishes when integrated over $x$ and $y$.

Figure 6.1. The contour of the $\ell^0$ integration can be rotated as shown.

Still following our work in the previous section, we now use the Gordon
identity (6.32) to eliminate $(p'+p)$ in favor of $i\sigma^{\mu\nu}q_\nu$. Our entire expression
for the $O(\alpha)$ contribution to the electron vertex then becomes

$$\bar u(p')\,\delta\Gamma^\mu(p',p)\,u(p) = 2ie^2\int\frac{d^4\ell}{(2\pi)^4}\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\,\frac{2}{D^3}$$

$$\times\;\bar u(p')\Big[\gamma^\mu\cdot\big(-\tfrac{1}{2}\ell^2 + (1-x)(1-y)q^2 + (1-4z+z^2)m^2\big) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\big(2m^2z(1-z)\big)\Big]u(p), \tag{6.47}$$

where as before,

$$D = \ell^2 - \Delta + i\epsilon, \qquad \Delta = -xyq^2 + (1-z)^2m^2 > 0.$$

The decomposition into form factors is now manifest.

With most of the work behind us, our main remaining task is to perform
the momentum integral. It is not difficult to evaluate the $\ell^0$ integral as a
contour integral, then do the spatial integrals in spherical coordinates. We
will use an even easier method, making use of a trick called Wick rotation.
Note that if it were not for the minus signs in the Minkowski metric, we could
perform the entire four-dimensional integral in four-dimensional "spherical"
coordinates. To remove the minus signs, consider the contour of integration
in the $\ell^0$-plane (see Fig. 6.1). The locations of the poles, and the fact that
the integrand falls off sufficiently rapidly at large $|\ell^0|$, allow us to rotate the
contour counterclockwise by 90°. We then define a Euclidean 4-momentum
variable $\ell_E$:

$$\ell^0 \equiv i\ell^0_E; \qquad \boldsymbol{\ell} = \boldsymbol{\ell}_E. \tag{6.48}$$

Our rotated contour goes from $\ell^0_E = -\infty$ to $\infty$. By simply changing
variables to $\ell_E$, we can now evaluate the integral in four-dimensional spherical
coordinates.

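Let us first evaluate

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{1}{[\ell^2-\Delta]^m} = \frac{i}{(-1)^m}\,\frac{1}{(2\pi)^4}\int d^4\ell_E\,\frac{1}{[\ell_E^2+\Delta]^m} = \frac{i(-1)^m}{(2\pi)^4}\int d\Omega_4\int_0^\infty d\ell_E\,\frac{\ell_E^3}{[\ell_E^2+\Delta]^m}.$$

(Here we need only the case $m = 3$, but the more general result will be useful
for other loop calculations.) The factor $\int d\Omega_4$ is the surface "area" of a
four-dimensional unit sphere, which happens to equal $2\pi^2$. (One way to compute
this area is to use four-dimensional spherical coordinates,

$$x = (r\sin\omega\sin\theta\cos\phi,\; r\sin\omega\sin\theta\sin\phi,\; r\sin\omega\cos\theta,\; r\cos\omega).$$

The integration measure is then $d^4x = r^3\sin^2\omega\,\sin\theta\;d\phi\,d\theta\,d\omega\,dr$.) The rest of
the integral is straightforward, and we have

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{1}{[\ell^2-\Delta]^m} = \frac{i(-1)^m}{(4\pi)^2}\,\frac{1}{(m-1)(m-2)}\,\frac{1}{\Delta^{m-2}}. \tag{6.49}$$

Similarly,

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{\ell^2}{[\ell^2-\Delta]^m} = \frac{i(-1)^{m-1}}{(4\pi)^2}\,\frac{2}{(m-1)(m-2)(m-3)}\,\frac{1}{\Delta^{m-3}}. \tag{6.50}$$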


Note that this second result is valid only when m > 3. When m = 3, the Wick 
rotation cannot be justified, and the integral is in any event divergent. But it 
is just this case that we need for (6.47). 
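After the Wick rotation, (6.49) reduces to a real one-dimensional integral times the area of the unit 3-sphere, and both pieces can be checked numerically. The sketch below drops the overall factor $i(-1)^m$ and compares magnitudes only. It maps $\ell_E \in (0,\infty)$ onto $t \in (0,1)$ via $\ell_E = t/(1-t)$ and uses a midpoint rule so that the endpoints are never evaluated; these numerical choices, and the function names, are assumptions of this illustration rather than anything prescribed by the text.

```python
import math

def midpoint(f, a, b, n):
    """Composite midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def sphere_area_4d(n=2000):
    """Angular integral of sin^2(w) sin(theta) dphi dtheta dw."""
    i_w = midpoint(lambda w: math.sin(w) ** 2, 0.0, math.pi, n)
    i_theta = midpoint(math.sin, 0.0, math.pi, n)
    return 2.0 * math.pi * i_w * i_theta

def euclidean_integral(m, delta, n=20000):
    """(2 pi^2 / (2 pi)^4) * int_0^inf dl l^3 / (l^2 + delta)^m, m >= 3."""
    def integrand(t):
        l = t / (1.0 - t)            # maps t in (0, 1) to l in (0, inf)
        jac = 1.0 / (1.0 - t) ** 2   # dl/dt
        return l**3 / (l * l + delta) ** m * jac
    radial = midpoint(integrand, 0.0, 1.0, n)
    return 2.0 * math.pi**2 * radial / (2.0 * math.pi) ** 4

def closed_form(m, delta):
    """Magnitude of the right-hand side of (6.49)."""
    return 1.0 / ((4.0 * math.pi) ** 2 * (m - 1) * (m - 2)
                  * delta ** (m - 2))

print(sphere_area_4d(), 2.0 * math.pi**2)
print(euclidean_integral(3, 2.0), closed_form(3, 2.0))
```

The same functions can be called with $m = 4$ or larger; for $m \le 2$ the radial integral diverges and the comparison is not meaningful.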

We will eventually explore the physical meaning of this divergence, but
for the moment we simply introduce an artificial prescription to make our
integral finite. Go back to the original expression for the Feynman integral in
(6.38), and replace in the photon propagator

$$\frac{1}{(k-p)^2+i\epsilon} \;\to\; \frac{1}{(k-p)^2+i\epsilon} - \frac{1}{(k-p)^2-\Lambda^2+i\epsilon}, \tag{6.51}$$

where $\Lambda$ is a very large mass. The integrand is unaffected for small $k$ (since
$\Lambda$ is large), but cuts off smoothly when $k \gtrsim \Lambda$. We can think of the second
term as the propagator of a fictitious heavy photon, whose contribution is
subtracted from that of the ordinary photon. In terms involving the heavy
photon, the numerator algebra is unchanged and the denominator is altered
by

$$\Delta \;\to\; \Delta_\Lambda = -xyq^2 + (1-z)^2m^2 + z\Lambda^2. \tag{6.52}$$
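The integral (6.50) is then replaced with a convergent integral, which can be
Wick-rotated and evaluated:

$$\int\frac{d^4\ell}{(2\pi)^4}\left(\frac{\ell^2}{[\ell^2-\Delta]^3} - \frac{\ell^2}{[\ell^2-\Delta_\Lambda]^3}\right) = \frac{i}{(4\pi)^2}\int_0^\infty d\ell_E^2\left(\frac{\ell_E^4}{[\ell_E^2+\Delta]^3} - \frac{\ell_E^4}{[\ell_E^2+\Delta_\Lambda]^3}\right)$$

$$= \frac{i}{(4\pi)^2}\,\log\left(\frac{\Delta_\Lambda}{\Delta}\right). \tag{6.53}$$

The convergent terms in (6.47) are modified by terms of order $\Lambda^{-2}$, which we
ignore.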
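The logarithm in (6.53) comes entirely from the real integral over $s = \ell_E^2$, which can be verified numerically. In the sketch below the two terms are combined before integrating (each is separately divergent), the half-line is mapped to the unit interval via $s = t/(1-t)$, and the values of $\Delta$ and $\Delta_\Lambda$ are arbitrary illustrative numbers. The function name is hypothetical.

```python
import math

def regulated_integral(delta, delta_lam, n=200000):
    """int_0^inf ds [ s^2/(s+delta)^3 - s^2/(s+delta_lam)^3 ], s = l_E^2.

    Uses the midpoint rule in t with s = t / (1 - t), so the endpoints
    t = 0 and t = 1 are never evaluated.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        s = t / (1.0 - t)
        jac = 1.0 / (1.0 - t) ** 2   # ds/dt
        diff = s * s / (s + delta) ** 3 - s * s / (s + delta_lam) ** 3
        total += diff * jac
    return total * h

delta, delta_lam = 1.5, 40.0         # illustrative values
print(regulated_integral(delta, delta_lam), math.log(delta_lam / delta))
```

Only the ratio $\Delta_\Lambda/\Delta$ enters the result, which is why the cutoff dependence of the vertex is purely logarithmic.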

This prescription for rendering Feynman integrals finite by introducing
fictitious heavy particles is known as Pauli-Villars regularization. Please note
that the fictitious photon has no physical significance, and that this method
is only one of many for defining the divergent integrals. (We will discuss other
methods in the next chapter; see especially Problem 7.2.) We must hope that
the new parameter $\Lambda$ will not appear in our final results for observable cross
sections.

Using formulae (6.49) and (6.53) to evaluate the integrals in (6.47), we
obtain an explicit, though complicated, expression for the one-loop vertex
correction:

$$\bar u(p')\,\delta\Gamma^\mu(p',p)\,u(p) = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)$$

$$\times\;\bar u(p')\left(\gamma^\mu\left[\log\frac{z\Lambda^2}{\Delta} + \frac{1}{\Delta}\big((1-x)(1-y)q^2 + (1-4z+z^2)m^2\big)\right] + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\left[\frac{1}{\Delta}\,2m^2z(1-z)\right]\right)u(p). \tag{6.54}$$

The bracketed expressions are our desired corrections to the form factors.

Before we try to interpret this result, let us summarize the calculational
methods we used. The techniques are common to all loop calculations:

1. Draw the diagram(s) and write down the amplitude.

2. Introduce Feynman parameters to combine the denominators of the
propagators.

3. Complete the square in the new denominator by shifting to a new loop
momentum variable, $\ell$.

4. Write the numerator in terms of $\ell$. Drop odd powers of $\ell$, and rewrite
even powers using identities like (6.46).

5. Perform the momentum integral by means of a Wick rotation and
four-dimensional spherical coordinates.

The momentum integral in the last step will often be divergent. In that case
we must define (or regularize) the integral using the Pauli-Villars prescription
or some other device.

Now that we have parametrized the ultraviolet divergence in (6.54), let
us try to interpret it. Notice that the divergence appears in the worst possible
place: It corrects $F_1(q^2 = 0)$, which should (according to our discussion at the
end of the previous section) be fixed at the value 1. But this is the only effect
of the divergent term. We will therefore adopt a simple but completely ad hoc
fix for this difficulty: Subtract from the above expression a term proportional
to the zeroth-order vertex function ($\bar u(p')\gamma^\mu u(p)$), in such a way as to maintain
the condition $F_1(0) = 1$. In other words, make the substitution

$$\delta F_1(q^2) \;\to\; \delta F_1(q^2) - \delta F_1(0) \tag{6.55}$$

(where $\delta F_1$ denotes the first-order correction to $F_1$). The justification of this
procedure involves the minor correction to our S-matrix formula (4.103)
mentioned in Section 4.6. In brief, the term we are subtracting corrects for our
omission of the external leg correction diagrams of (6.1). We postpone the
justification of this statement until Section 7.2.

There is also an infrared divergence in $F_1(q^2)$, coming from the $1/\Delta$ term.
For example, at $q^2 = 0$ this term is

$$\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\,\frac{1-4z+z^2}{\Delta(q^2=0)} = \int_0^1 dz\int_0^{1-z} dy\;\frac{-2+(1-z)(3-z)}{m^2(1-z)^2}$$

$$= \int_0^1 dz\;\frac{-2}{m^2(1-z)} + \text{finite terms}.$$

We can cure this disease by pretending that the photon has a small nonzero
mass $\mu$. Then in the denominator of the photon propagator, $(k-p)^2$ would
become $(k-p)^2 - \mu^2$. This denominator was multiplied by $z$ in (6.43), so the
net effect is to add a term $z\mu^2$ to $\Delta$. We will discuss the infrared divergence
further in the next two sections.



With both of these provisional modifications, the form factors are

$$F_1(q^2) = 1 + \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\left[\log\left(\frac{m^2(1-z)^2}{m^2(1-z)^2-q^2xy}\right) + \frac{m^2(1-4z+z^2)+q^2(1-x)(1-y)}{m^2(1-z)^2-q^2xy+\mu^2z} - \frac{m^2(1-4z+z^2)}{m^2(1-z)^2+\mu^2z}\right] + O(\alpha^2); \tag{6.56}$$

$$F_2(q^2) = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\left[\frac{2m^2z(1-z)}{m^2(1-z)^2-q^2xy}\right] + O(\alpha^2). \tag{6.57}$$


Note that neither the ultraviolet nor the infrared divergence affects $F_2(q^2)$.
We can therefore evaluate unambiguously

$$F_2(q^2=0) = \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\,\delta(x+y+z-1)\,\frac{2m^2z(1-z)}{m^2(1-z)^2}
= \frac{\alpha}{\pi}\int_0^1 dz\int_0^{1-z} dy\,\frac{z}{1-z} = \frac{\alpha}{2\pi}. \tag{6.58}$$

Thus, we get a correction to the $g$-factor of the electron:

$$a_e \equiv \frac{g-2}{2} = \frac{\alpha}{2\pi} \approx .0011614. \tag{6.59}$$


This result was first obtained by Schwinger in 1948.* Experiments give $a_e =
.0011597$. Apparently, the unambiguous value that we obtained for $F_2(0)$ is
also, up to higher orders in $\alpha$, unambiguously correct.
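The value in (6.58) and (6.59) can be cross-checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the $F_2$ integrand over the Feynman-parameter triangle with a midpoint rule. The input value of $\alpha$ and the function name `f2_at_zero` are assumptions chosen for illustration.

```python
import math

ALPHA = 1.0 / 137.036  # assumed value of the fine-structure constant


def f2_at_zero(alpha, n=200):
    """Midpoint-rule estimate of F_2(q^2 = 0), Eq. (6.58).

    The delta function fixes x = 1 - y - z, leaving 0 < z < 1 and
    0 < y < 1 - z. The factors of m^2 cancel in the integrand.
    """
    hz = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * hz
        hy = (1.0 - z) / n  # the y cells shrink as z approaches 1
        for _ in range(n):
            total += 2.0 * z * (1.0 - z) / (1.0 - z) ** 2 * hy * hz
    return alpha / (2.0 * math.pi) * total


a_e = f2_at_zero(ALPHA)
print(f"a_e = {a_e:.7f}")
```

Because the integrand does not depend on $y$, the inner loop is trivial; it is kept only so that the code mirrors the double integral in (6.58).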


Precision Tests of QED 

Building on the success of the order-$\alpha$ QED prediction for $a_e$, successive
generations of physicists have improved the accuracy of both the theoretical and the
experimental determination of this quantity. The coefficients of the QED
formula for $a_e$ are now known through order $\alpha^4$. The calculation of the order-$\alpha^2$
and higher coefficients requires a systematic treatment of ultraviolet
divergences.

These challenging theoretical calculations have been matched by increasingly
imaginative experiments. The most recent measurement of $a_e$ uses a
technique, developed by Dehmelt and collaborators, in which individual
electrons are trapped in a system of electrostatic and magnetostatic fields and
excited to a spin resonance.† Today, the best theoretical and experimental
values of $a_e$ agree to eight significant figures.

*J. Schwinger, Phys. Rev. 73, 416L (1948).

High-order QED calculations have also been carried out for several other 
quantities. These include transition energies in hydrogen and hydrogen-like 
atoms, the anomalous magnetic moment of the muon, and the decay rates of 
singlet and triplet positronium. Many of these quantities have also been measured
to high precision. The full set of these comparisons gives a detailed test
of the validity of QED in a variety of settings. The results of these precision 
tests are summarized in Table 6.1. 

There is some subtlety in reporting the results of precision comparisons
between QED theory and experiment, since theoretical predictions require an
extremely precise value of $\alpha$, which can only be obtained from another
precision QED experiment. We therefore quote each comparison between theory
and experiment as an independent determination of $\alpha$. Each value of $\alpha$ is
assigned an error that is the composite of the expected uncertainties from theory
and experiment. QED is confirmed to the extent that the values of $\alpha$ from
different sources agree.

The first nine entries in Table 6.1 refer to QED calculations in atomic
physics settings. Of these, the hydrogen hyperfine splitting, measured using
Ramsey's hydrogen maser, is the most precisely known quantity in physics.
Unfortunately, the influence of the internal structure of the proton leads to
uncertainties that limit the accuracy with which this quantity can be predicted
theoretically. The same difficulty applies to the Lamb shift, the splitting
between the $j = 1/2$ $2S$ and $2P$ levels of hydrogen. The most accurate QED
tests now come from systems that involve no strongly interacting particles,
the electron $g-2$ and the hyperfine splitting in the $e^-\mu^+$ atom, muonium. The
last entry in this group gives a new method for determining $\alpha$, by converting
a very accurate measurement of the neutron Compton wavelength, using
accurately known mass ratios, to a value of the electron mass. This can be
combined with the known value of the Rydberg energy and accurate QED
formulae to determine $\alpha$. The only serious discrepancy among these numbers
comes in the triplet positronium decay rate; however, there is some evidence
that diagrams of relative order $\alpha^2$ give a large correction to the value quoted
in the table.

The next two entries are determinations of $\alpha$ from higher-order QED
reactions at high-energy electron colliders. These high-energy experiments
typically achieve only percent-level accuracy, but their results are consistent with
the precise information available at lower energies.

Finally, the last two entries in the table give two independent measurements
of $\alpha$ from exotic quantum interference phenomena in condensed-matter
systems. These two effects provide a standard resistance and a standard
frequency, respectively, which are believed to measure the charge of the electron
with corrections that are strictly zero for macroscopic systems.‡

The entire picture fits together well beyond any reasonable expectation.
On the evidence presented in this table, QED is the most stringently tested,
and the most dramatically successful, of all physical theories.

†R. Van Dyck, Jr., P. Schwinberg, and H. Dehmelt, Phys. Rev. Lett. 59, 26 (1987).

‡For a discussion of these effects, and their exact relation to $\alpha$, see D. R. Yennie,
Rev. Mod. Phys. 59, 781 (1987).

Table 6.1. Values of $\alpha^{-1}$ Obtained from Precision QED Experiments

Low-Energy QED:
    Electron ($g-2$)                                   137.035 992 35 (73)
    Muon ($g-2$)                                       137.035 5 (1 1)
    Muonium hyperfine splitting                        137.035 994 (18)
    Lamb shift                                         137.036 8 (7)
    Hydrogen hyperfine splitting                       137.036 0 (3)
    $2\,{}^3S_1 - 1\,{}^3S_1$ splitting in positronium   137.034 (16)
    ${}^1S_0$ positronium decay rate                   137.00 (6)
    ${}^3S_1$ positronium decay rate                   136.971 (6)
    Neutron Compton wavelength                         137.036 010 1 (5 4)

High-Energy QED:
    $\sigma(e^+e^- \to e^+e^-e^+e^-)$                  136.5 (2.7)
    $\sigma(e^+e^- \to e^+e^-\mu^+\mu^-)$              139.9 (1.2)

Condensed Matter:
    Quantum Hall effect                                137.035 997 9 (3 2)
    AC Josephson effect                                137.035 977 0 (7 7)

Each value of $\alpha$ displayed in this table is obtained by fitting an experimental
measurement to a theoretical expression that contains $\alpha$ as a parameter. The
numbers in parentheses are the standard errors in the last displayed digits,
including both theoretical and experimental uncertainties. This table is based
on results presented in the survey of precision QED of Kinoshita (1990). That
book contains a series of lucid reviews of the remarkable theoretical and
experimental technology that has been developed for the detailed analysis of
QED processes. The five most accurate values are updated as given by T. Kinoshita
in History of Original Ideas and Basic Discoveries in Particle Physics,
H. Newman and T. Ypsilantis, eds. (Plenum Press, New York, 1995). This
latter paper also gives an interesting perspective on the future of precision
QED experiments.
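To make the phrase "to the extent that the values of $\alpha$ from different sources agree" concrete, the sketch below compares a few entries of Table 6.1 with the electron $g-2$ value. It rests on assumptions that are not in the text: errors are treated as Gaussian and uncorrelated, an error such as (5 4) is read as 54 in the last two displayed digits, and the helper name `pull` is purely illustrative.

```python
import math

# (alpha^-1, standard error) transcribed from Table 6.1
TABLE_6_1 = {
    "electron g-2": (137.03599235, 0.00000073),
    "muonium hyperfine splitting": (137.035994, 0.000018),
    "neutron Compton wavelength": (137.0360101, 0.0000054),
    "quantum Hall effect": (137.0359979, 0.0000032),
    "AC Josephson effect": (137.0359770, 0.0000077),
    "3S1 positronium decay rate": (136.971, 0.006),
}


def pull(a, b):
    """Difference of two entries in units of their combined standard error."""
    (value_a, err_a), (value_b, err_b) = a, b
    return (value_a - value_b) / math.hypot(err_a, err_b)


reference = TABLE_6_1["electron g-2"]
for name, entry in TABLE_6_1.items():
    print(f"{name:30s} {pull(entry, reference):+7.2f} sigma")
```

Under these assumptions the triplet positronium decay rate stands out by roughly ten standard errors, which is the discrepancy singled out in the discussion of the table.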
Now let us confront the infrared divergence in our result (6.56) for $F_1(q^2)$.
The dominant part, in the $\mu \to 0$ limit, is

$$F_1(q^2) \approx \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\,\delta(x+y+z-1)
\left[\frac{m^2(1-4z+z^2) + q^2(1-x)(1-y)}{m^2(1-z)^2 - q^2xy + \mu^2 z}
- \frac{m^2(1-4z+z^2)}{m^2(1-z)^2 + \mu^2 z}\right]. \tag{6.60}$$

To understand this expression we must do some work to simplify it, extracting
and evaluating the divergent part of the integral. Throughout this section we
will retain only terms that diverge in the limit $\mu \to 0$.

First note that the divergence occurs in the corner of Feynman-parameter
space where $z \approx 1$ (and therefore $x \approx y \approx 0$). In this region we can set $z = 1$
and $x = y = 0$ in the numerators of (6.60). We can also set $z = 1$ in the $\mu^2$
terms in the denominators. Using the delta function to evaluate the $x$-integral,
we then have

$$F_1(q^2) \approx \frac{\alpha}{2\pi}\int_0^1 dz\int_0^{1-z} dy
\left[\frac{-2m^2 + q^2}{m^2(1-z)^2 - q^2 y(1-z-y) + \mu^2}
- \frac{-2m^2}{m^2(1-z)^2 + \mu^2}\right].$$

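(The lower limit on the $z$-integral is unimportant.) Making the variable
changes

$$y = (1-z)\xi, \qquad w = (1-z),$$

this expression becomes

$$F_1(q^2) \approx \frac{\alpha}{2\pi}\int_0^1 d\xi\,\frac{1}{2}\int_0^1 d(w^2)
\left[\frac{-2m^2 + q^2}{[m^2 - q^2\xi(1-\xi)]w^2 + \mu^2} - \frac{-2m^2}{m^2w^2 + \mu^2}\right]$$

$$= \frac{\alpha}{4\pi}\int_0^1 d\xi
\left[\frac{-2m^2 + q^2}{m^2 - q^2\xi(1-\xi)}\log\left(\frac{m^2 - q^2\xi(1-\xi)}{\mu^2}\right)
+ 2\log\left(\frac{m^2}{\mu^2}\right)\right].$$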

In the limit $\mu \to 0$ we can ignore the details of the numerators inside the
logarithms; anything proportional to $m^2$ or $q^2$ is effectively the same. We
therefore write

$$F_1(q^2) = 1 - \frac{\alpha}{2\pi}\,f_{\text{IR}}(q^2)\,\log\left(\frac{-q^2 \text{ or } m^2}{\mu^2}\right) + O(\alpha^2), \tag{6.61}$$

where the coefficient of the divergent logarithm is

$$f_{\text{IR}}(q^2) = \int_0^1 \left(\frac{m^2 - q^2/2}{m^2 - q^2\xi(1-\xi)}\right) d\xi - 1. \tag{6.62}$$





Since $q^2$ is negative and $\xi(1-\xi)$ has a maximum value of 1/4, the first term
is greater than 1 and hence $f_{\text{IR}}(q^2)$ is positive.
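The sign of $f_{\text{IR}}(q^2)$ is easy to confirm numerically. The sketch below is an illustration only: it evaluates (6.62) with a midpoint rule for a few spacelike values of $q^2$, in units where $m = 1$. The function name `f_ir` and the sample points are assumptions.

```python
def f_ir(q2, m=1.0, n=20000):
    """Midpoint-rule estimate of f_IR(q^2), Eq. (6.62), for q2 <= 0."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += (m * m - q2 / 2.0) / (m * m - q2 * xi * (1.0 - xi))
    return total * h - 1.0


for q2 in (0.0, -0.1, -1.0, -10.0):
    print(f"q^2 = {q2:6.1f} m^2   f_IR = {f_ir(q2):.6f}")
```

The coefficient vanishes at $q^2 = 0$ and grows as $-q^2$ increases, so the correction in (6.61) is always a suppression for spacelike momentum transfer.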

How does this infinite term affect the cross section for electron scattering
off a potential? Since $F_1(q^2)$ is just the quantity that multiplies $\gamma^\mu$ in the
matrix element, we can find the new cross section by making the replacement
$e \to e \cdot F_1(q^2)$. The cross section for the process $p \to p'$ is therefore

$$\frac{d\sigma}{d\Omega} \approx \left(\frac{d\sigma}{d\Omega}\right)_0
\left[1 - \frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\,\log\left(\frac{-q^2 \text{ or } m^2}{\mu^2}\right) + O(\alpha^2)\right], \tag{6.63}$$

where the first factor is the tree-level result. Note that the $O(\alpha)$ correction
to the cross section is not only infinite, but negative. Something is terribly
wrong.

To gain a better understanding of the divergence, let us evaluate the
coefficient of the divergent logarithm, $f_{\text{IR}}(q^2)$, in the limit $-q^2 \to \infty$. In this
limit, we find a second logarithm:

$$f_{\text{IR}}(q^2) \approx \int_0^1 d\xi\,\frac{-q^2/2}{-q^2\xi(1-\xi) + m^2}
\approx \frac{1}{2}\int_0^1 d\xi\,\frac{-q^2}{-q^2\xi + m^2}
+ \left(\text{equal contribution from } \xi \approx 1\right)
= \log\left(\frac{-q^2}{m^2}\right). \tag{6.64}$$

The form factor in this limit is therefore

$$F_1(-q^2 \to \infty) = 1 - \frac{\alpha}{2\pi}\log\left(\frac{-q^2}{m^2}\right)\log\left(\frac{-q^2}{\mu^2}\right) + O(\alpha^2). \tag{6.65}$$

Note that the numerator in the second logarithm is $-q^2$, not $m^2$; this
expression contains not only the correct coefficient of $\log(1/\mu^2)$, but also the correct
coefficient of $\log^2(q^2)$.
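The leading-logarithm statement in (6.64) can be checked against the full integral (6.62). The sketch below works in terms of $u = -q^2/m^2$ and uses a closed form for the $\xi$ integral that was worked out by hand for this illustration; it is not quoted from the text, so it is compared with a direct midpoint evaluation at moderate $u$ before being used at large $u$. All function names are assumptions.

```python
import math


def f_ir_midpoint(u, n=20000):
    """f_IR of Eq. (6.62) by the midpoint rule, with u = -q^2 / m^2 >= 0."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += (1.0 + u / 2.0) / (1.0 + u * xi * (1.0 - xi))
    return total * h - 1.0


def f_ir_closed(u):
    """Same quantity with the xi integral done analytically (u > 0)."""
    a = math.sqrt(1.0 + u / 4.0)
    b = math.sqrt(u) / 2.0
    # integral over 0 < xi < 1 of 1 / (1 + u xi (1 - xi)) = log(a + b) / (a b)
    return (1.0 + u / 2.0) * math.log(a + b) / (a * b) - 1.0


for u in (1e1, 1e3, 1e6, 1e9):
    f = f_ir_closed(u)
    print(f"u = {u:.0e}   f_IR = {f:9.4f}   log(u) = {math.log(u):9.4f}")
```

In this sketch the difference $f_{\text{IR}} - \log(-q^2/m^2)$ tends to a constant at large $-q^2$. The logarithmic growth therefore agrees with (6.64), while the constant lies beyond the leading-logarithm accuracy retained there.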

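The same double logarithm of $-q^2$ appeared in the cross section for soft
bremsstrahlung, Eq. (6.26). This correspondence points to a resolution of the
infrared divergence problem. Comparing (6.65) with (6.26), we find in the
limit $-q^2 \to \infty$

$$\frac{d\sigma}{d\Omega}(p \to p') = \left(\frac{d\sigma}{d\Omega}\right)_0
\left[1 - \frac{\alpha}{\pi}\log\left(\frac{-q^2}{m^2}\right)\log\left(\frac{-q^2}{\mu^2}\right) + O(\alpha^2)\right];$$

$$\frac{d\sigma}{d\Omega}(p \to p' + \gamma) = \left(\frac{d\sigma}{d\Omega}\right)_0
\left[+\frac{\alpha}{\pi}\log\left(\frac{-q^2}{m^2}\right)\log\left(\frac{-q^2}{\mu^2}\right) + O(\alpha^2)\right]. \tag{6.66}$$

The separate cross sections are divergent, but their sum is independent of $\mu$,
and therefore finite.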

In fact, neither the elastic cross section nor the soft bremsstrahlung cross
section can be measured individually; only their sum is physically observable.
In any real experiment, a photon detector can detect photons only down to
some minimum limiting energy $E_\ell$. The probability that a scattering event
occurs and this detector does not see a photon is the sum

$$\frac{d\sigma}{d\Omega}(p \to p') + \frac{d\sigma}{d\Omega}\big(p \to p' + \gamma(k < E_\ell)\big)
\equiv \left(\frac{d\sigma}{d\Omega}\right)_{\text{measured}}. \tag{6.67}$$


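The divergent part of this "measured" cross section is

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{measured}} \approx \left(\frac{d\sigma}{d\Omega}\right)_0
\left[1 - \frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{-q^2 \text{ or } m^2}{\mu^2}\right)
+ \frac{\alpha}{2\pi}\,\mathcal{I}(\mathbf{v},\mathbf{v}')\log\left(\frac{E_\ell^2}{\mu^2}\right) + O(\alpha^2)\right].$$

We have just seen that $\mathcal{I}(\mathbf{v},\mathbf{v}') = 2f_{\text{IR}}(q^2)$ when $-q^2 \gg m^2$. If the same
relation holds for general $q^2$, the measured cross section becomes

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{measured}} \approx \left(\frac{d\sigma}{d\Omega}\right)_0
\left[1 - \frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{-q^2 \text{ or } m^2}{E_\ell^2}\right) + O(\alpha^2)\right], \tag{6.68}$$

which depends on the experimental conditions, but no longer on $\mu^2$. The
infrared divergences from soft bremsstrahlung and from $F_1(q^2)$ cancel each
other, yielding a finite cross section for the quantity that can actually be
measured.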
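The cancellation leading to (6.68) amounts to a few lines of arithmetic, sketched here for illustration. The numerical inputs ($\alpha$, the value of $f_{\text{IR}}$, $-q^2$, and $E_\ell$, all in units of the electron mass) are hypothetical, and the relation $\mathcal{I}(\mathbf{v},\mathbf{v}') = 2f_{\text{IR}}(q^2)$ is assumed rather than derived.

```python
import math

ALPHA = 1.0 / 137.036  # assumed value of the fine-structure constant


def elastic_correction(f_ir, minus_q2, mu):
    """O(alpha) term for p -> p', as in Eq. (6.63)."""
    return -(ALPHA / math.pi) * f_ir * math.log(minus_q2 / mu**2)


def soft_photon_correction(f_ir, e_l, mu):
    """O(alpha) term for p -> p' + gamma(k < E_l), assuming I(v, v') = 2 f_IR."""
    i_vv = 2.0 * f_ir
    return (ALPHA / (2.0 * math.pi)) * i_vv * math.log(e_l**2 / mu**2)


def measured_correction(f_ir, minus_q2, e_l, mu):
    """Sum of the two terms; it should not depend on the photon mass mu."""
    return (elastic_correction(f_ir, minus_q2, mu)
            + soft_photon_correction(f_ir, e_l, mu))


# Hypothetical inputs: f_IR = 1.5, -q^2 = 10 m^2, E_l = 0.01 m.
results = [measured_correction(1.5, 10.0, 0.01, mu) for mu in (1e-3, 1e-6, 1e-12)]
expected = -(ALPHA / math.pi) * 1.5 * math.log(10.0 / 0.01**2)
print(results, expected)
```

Each term on its own grows without bound as `mu` decreases, while the sum stays fixed at the value given by the bracket in (6.68).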

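We must still verify the identity $\mathcal{I}(\mathbf{v},\mathbf{v}') = 2f_{\text{IR}}(q^2)$ for arbitrary values
of $q^2$. From (6.13) we have

$$\mathcal{I}(\mathbf{v},\mathbf{v}') = \int\frac{d\Omega_{\hat k}}{4\pi}
\left(\frac{2p\cdot p'}{(\hat k\cdot p')(\hat k\cdot p)} - \frac{m^2}{(\hat k\cdot p')^2} - \frac{m^2}{(\hat k\cdot p)^2}\right). \tag{6.69}$$

The last two terms are easy to evaluate:

$$\int\frac{d\Omega_{\hat k}}{4\pi}\,\frac{1}{(\hat k\cdot p)^2}
= \frac{1}{2}\int_{-1}^{1} d\cos\theta\,\frac{1}{(p^0 - p\cos\theta)^2}
= \frac{1}{p^2} = \frac{1}{m^2}.$$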

In the first term, we can combine the denominators with a Feynman parameter
and perform the integral in the same way:

$$\int\frac{d\Omega_{\hat k}}{4\pi}\,\frac{1}{(\hat k\cdot p')(\hat k\cdot p)}
= \int_0^1 d\xi\int\frac{d\Omega_{\hat k}}{4\pi}\,\frac{1}{\big[\xi\,\hat k\cdot p' + (1-\xi)\,\hat k\cdot p\big]^2}
= \int_0^1 d\xi\,\frac{1}{\big[\xi p' + (1-\xi)p\big]^2}
= \int_0^1 d\xi\,\frac{1}{m^2 - \xi(1-\xi)q^2}.$$

(In the last step we have used $2p\cdot p' = 2m^2 - q^2$.) Putting all the terms of
(6.69) together, we find

$$\mathcal{I}(\mathbf{v},\mathbf{v}') = \int_0^1\left(\frac{2m^2 - q^2}{m^2 - \xi(1-\xi)q^2}\right)d\xi - 2 = 2f_{\text{IR}}(q^2), \tag{6.70}$$

just what we need to cancel the infrared divergence.
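The identity (6.70) can also be tested by brute force. The following sketch, offered as an illustration under stated assumptions, integrates (6.69) over the photon direction on a midpoint grid and compares the result with $2f_{\text{IR}}(q^2)$ from (6.62). It assumes elastic scattering off a static potential (equal initial and final energies), units with $m = 1$, and arbitrarily chosen kinematics; the function names are hypothetical.

```python
import math


def f_ir(q2, m=1.0, n=20000):
    """Midpoint-rule estimate of f_IR(q^2), Eq. (6.62)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * h
        total += (m * m - q2 / 2.0) / (m * m - q2 * xi * (1.0 - xi))
    return total * h - 1.0


def i_vv(energy, angle, m=1.0, n_cos=400, n_phi=200):
    """Angular integral of Eq. (6.69); returns (I, q^2) for the kinematics."""
    pmag = math.sqrt(energy**2 - m**2)
    p = (0.0, 0.0, pmag)                                      # incoming, along z
    pp = (pmag * math.sin(angle), 0.0, pmag * math.cos(angle))  # outgoing
    p_dot_pp = energy**2 - sum(a * b for a, b in zip(p, pp))
    total = 0.0
    for i in range(n_cos):
        c = -1.0 + (i + 0.5) * 2.0 / n_cos
        s = math.sqrt(1.0 - c * c)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            khat = (s * math.cos(phi), s * math.sin(phi), c)
            # khat . p with the four-vector khat = (1, unit vector)
            kp = energy - sum(a * b for a, b in zip(khat, p))
            kpp = energy - sum(a * b for a, b in zip(khat, pp))
            total += 2.0 * p_dot_pp / (kpp * kp) - m * m / kpp**2 - m * m / kp**2
    cell = (2.0 / n_cos) * (2.0 * math.pi / n_phi)
    return total * cell / (4.0 * math.pi), 2.0 * m * m - 2.0 * p_dot_pp


lhs, q2 = i_vv(energy=1.25, angle=2.0)
rhs = 2.0 * f_ir(q2)
print(f"q^2 = {q2:.4f}   I(v,v') = {lhs:.5f}   2 f_IR = {rhs:.5f}")
```

The two numbers should agree up to the discretization error of the grids, for any choice of energy and scattering angle.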

Although Eq. (6.68) demonstrates the cancellation of the infrared divergence,
this result has little practical use. An experimentalist would want to
know the precise dependence on $q^2$, which we did not evaluate carefully.
Recall from (6.65), however, that we were careful to obtain the correct coefficient
of $\log^2(-q^2)$ in the limit $-q^2 \gg m^2$. In that limit, therefore, (6.68) becomes

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{measured}} \approx \left(\frac{d\sigma}{d\Omega}\right)_0
\left[1 - \frac{\alpha}{\pi}\log\left(\frac{-q^2}{m^2}\right)\log\left(\frac{-q^2}{E_\ell^2}\right) + O(\alpha^2)\right]. \tag{6.71}$$

This result is unambiguous and useful. Note that the $O(\alpha)$ correction again
involves the Sudakov double logarithm.


6.5 Summation and Interpretation of Infrared Divergences


The discussion of infrared divergences in the previous section suffices for
removing the infinities from our bremsstrahlung and vertex-correction
calculations. There are still, however, three points that we have not addressed:

1. We have not demonstrated the cancellation of infrared divergences beyond
the leading order.

2. The correction to the measured cross section that we found after the
infrared cancellation (Eqs. (6.68) and (6.71)) can be made arbitrarily
negative by making photon detectors with a sufficiently low threshold $E_\ell$.

3. We have not yet reproduced the classical result (6.19) for the number of
photons radiated during a collision.

The solutions of the second and third problems will follow immediately from
that of the first, to which we now turn.

A complete treatment of infrared divergences to all orders is beyond the
scope of this book.* We will discuss here only the terms with the largest
logarithmic enhancement at each order of perturbation theory. In general,
these terms are of order

$$\left[\frac{\alpha}{\pi}\log\left(\frac{-q^2}{\mu^2}\right)\log\left(\frac{-q^2}{m^2}\right)\right]^n \tag{6.72}$$

in the $n$th order of perturbation theory. Our final physical conclusions were
first presented by Bloch and Nordsieck in a prescient paper written before the
invention of relativistic perturbation theory.† We will follow a modern, and
simplified, version of the analysis due to Weinberg.‡

*The definitive treatment is given in D. Yennie, S. Frautschi, and H. Suura, Ann.
Phys. 13, 379 (1961).

†F. Bloch and A. Nordsieck, Phys. Rev. 52, 54 (1937).

‡S. Weinberg, Phys. Rev. 140, B516 (1965).
Infrared divergences arise from photons with “soft” momenta: real photons
with energy less than some cutoff $E_\ell$, and virtual photons with (after
Wick rotation) $k^2 < E_\ell^2$. A typical higher-order diagram will involve numerous
real and virtual photons. But to find a divergence, we need more than
a soft photon; we need a singular denominator in an electron propagator.
Consider, for example, the following two diagrams:


The first diagram, in which the electron emits a soft photon followed by a
hard photon, has no infrared divergence, since the momenta in both electron
propagators are far from the mass shell. If the soft photon is emitted last,
however, the denominator of the adjacent propagator is $(p' + k)^2 - m^2 =
2p'\cdot k$, which vanishes as $k \to 0$. Thus the second diagram does contain a
divergence. We would like, then, to consider diagrams in which an arbitrary
hard process, possibly involving emission of hard and soft photons, is modified
by the addition of soft real and virtual photons on the electron legs:


Following Weinberg, we will add up the contributions of all such diagrams.
The only new difficulty in this calculation will be in the combinatorics of
counting all the ways in which the photons can appear.

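First consider the outgoing electron line:

We attach $n$ photons to the line, with momenta $k_1 \ldots k_n$. For the moment we
do not care whether these are external photons, virtual photons connected to
each other, or virtual photons connected to vertices on the incoming electron
line. The Dirac structure of this diagram is

$$\bar u(p')\,(-ie\gamma^{\mu_1})\,\frac{i(\slashed{p}' + \slashed{k}_1 + m)}{2p'\cdot k_1 + O(k^2)}\,
(-ie\gamma^{\mu_2})\,\frac{i(\slashed{p}' + \slashed{k}_1 + \slashed{k}_2 + m)}{2p'\cdot(k_1+k_2) + O(k^2)}\cdots
(-ie\gamma^{\mu_n})\,\frac{i(\slashed{p}' + \slashed{k}_1 + \cdots + \slashed{k}_n + m)}{2p'\cdot(k_1+\cdots+k_n) + O(k^2)}\,
(i\mathcal{M}_{\text{hard}})\cdots \tag{6.73}$$

We will assume that all the $k_i$ are small, dropping the $O(k^2)$ terms in the
denominators. We will also drop the $\slashed{k}_i$ terms in the numerators, just as in
our treatment of bremsstrahlung in Section 6.1. Also, as we did there, we can
push the factors of $(\slashed{p}' + m)$ to the left and use $\bar u(p')(-\slashed{p}' + m) = 0$:

$$\bar u(p')\,\gamma^{\mu_1}(\slashed{p}' + m)\,\gamma^{\mu_2}(\slashed{p}' + m)\cdots
= \bar u(p')\,2p'^{\mu_1}\,\gamma^{\mu_2}(\slashed{p}' + m)\cdots
= \bar u(p')\,2p'^{\mu_1}\,2p'^{\mu_2}\cdots.$$

This turns expression (6.73) into

$$\bar u(p')\left(e\,\frac{p'^{\mu_1}}{p'\cdot k_1}\right)\left(e\,\frac{p'^{\mu_2}}{p'\cdot(k_1+k_2)}\right)\cdots
\left(e\,\frac{p'^{\mu_n}}{p'\cdot(k_1+\cdots+k_n)}\right)\cdots. \tag{6.74}$$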

Still working with only the outgoing electron line, we must now sum over
all possible orderings of the momenta $k_1 \ldots k_n$. (This procedure will overcount
when two of the photons are attached together to form a single virtual photon.
We will deal with this overcounting later.) There are $n!$ different diagrams to
sum, corresponding to the $n!$ permutations of the $n$ photon momenta. Let $\pi$
denote one such permutation, so that $\pi(i)$ is the number between 1 and $n$ that
$i$ is taken to. (For example, if $\pi$ denotes the permutation that takes $1 \to 3$,
$2 \to 1$ and $3 \to 2$, then $\pi(1) = 3$, $\pi(2) = 1$, and $\pi(3) = 2$.)

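Armed with this notation, we can perform the sum over permutations by
means of the following identity:

$$\sum_{\text{all permutations } \pi}
\frac{1}{p\cdot k_{\pi(1)}}\,\frac{1}{p\cdot(k_{\pi(1)} + k_{\pi(2)})}\cdots
\frac{1}{p\cdot(k_{\pi(1)} + k_{\pi(2)} + \cdots + k_{\pi(n)})}
= \frac{1}{p\cdot k_1}\,\frac{1}{p\cdot k_2}\cdots\frac{1}{p\cdot k_n}. \tag{6.75}$$

The proof of this formula proceeds by induction on $n$. For $n = 2$ we have

$$\sum_\pi \frac{1}{p\cdot k_{\pi(1)}}\,\frac{1}{p\cdot(k_{\pi(1)} + k_{\pi(2)})}
= \frac{1}{p\cdot k_1}\,\frac{1}{p\cdot(k_1+k_2)} + \frac{1}{p\cdot k_2}\,\frac{1}{p\cdot(k_2+k_1)}
= \frac{1}{p\cdot k_1}\,\frac{1}{p\cdot k_2}.$$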
For the induction step, notice that the last factor on the left-hand side of
(6.75) is the same for every permutation $\pi$. Pulling this factor outside the
sum, the left-hand side becomes

$$\text{LHS} = \frac{1}{p\cdot\sum k}\,\sum_\pi
\frac{1}{p\cdot k_{\pi(1)}}\,\frac{1}{p\cdot(k_{\pi(1)} + k_{\pi(2)})}\cdots
\frac{1}{p\cdot(k_{\pi(1)} + \cdots + k_{\pi(n-1)})}.$$

For any given $\pi$, the quantity being summed is independent of $k_{\pi(n)}$. Letting
$i = \pi(n)$, we can now write

$$\sum_\pi = \sum_{i=1}^{n}\,\sum_{\pi'(i)},$$

where $\pi'(i)$ is the set of all permutations on the remaining $n - 1$ integers.
Assuming by induction that (6.75) is true for $n - 1$, we have

$$\text{LHS} = \frac{1}{p\cdot\sum k}\,\sum_{i=1}^{n}
\frac{1}{p\cdot k_1}\,\frac{1}{p\cdot k_2}\cdots\frac{1}{p\cdot k_{i-1}}\,\frac{1}{p\cdot k_{i+1}}\cdots\frac{1}{p\cdot k_n}.$$

If we now multiply and divide each term in this sum by $p\cdot k_i$, we easily obtain
our desired result (6.75).
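Identity (6.75) only involves the numbers $a_i = p\cdot k_i$, so it can be verified exactly for small $n$ by brute-force enumeration. The sketch below is an illustration with arbitrarily chosen rational values for the $a_i$; the function names are assumptions.

```python
from fractions import Fraction
from itertools import permutations


def sum_over_orderings(dots):
    """Left-hand side of Eq. (6.75); dots[i] plays the role of p . k_i."""
    total = Fraction(0)
    for ordering in permutations(dots):
        term = Fraction(1)
        partial = Fraction(0)
        for a in ordering:
            partial += a      # p . (k_pi(1) + ... + k_pi(j))
            term /= partial
        total += term
    return total


def product_of_inverses(dots):
    """Right-hand side of Eq. (6.75)."""
    result = Fraction(1)
    for a in dots:
        result /= a
    return result


dots = [Fraction(3), Fraction(5, 2), Fraction(7), Fraction(1, 3), Fraction(11, 4)]
print(sum_over_orderings(dots), product_of_inverses(dots))
```

Rational arithmetic is used so that the two sides can be compared for exact equality rather than to floating-point tolerance.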

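Applying (6.75) to (6.74), we find

$$\big[\text{outgoing line with blob}\big]
= \bar u(p')\left(e\,\frac{p'^{\mu_1}}{p'\cdot k_1}\right)\left(e\,\frac{p'^{\mu_2}}{p'\cdot k_2}\right)\cdots
\left(e\,\frac{p'^{\mu_n}}{p'\cdot k_n}\right), \tag{6.76}$$

where the blob denotes a sum over all possible orders of inserting the $n$ photon
lines.

A similar set of manipulations simplifies the sum over soft photon insertions
on the initial electron line. There, however, the propagator momenta are
$p - k_1$, $p - k_1 - k_2$, and so on:


We therefore get an extra minus sign in the factor for each photon, since
$(p - \Sigma k)^2 - m^2 \approx -2p\cdot\Sigma k$.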
Now consider diagrams containing a total of $n$ soft photons, connected in
any possible order to the initial or final electron lines. The sum over all such
diagrams can be written

$$\big[\text{sum of diagrams}\big] = \bar u(p')\,i\mathcal{M}_{\text{hard}}\,u(p)\cdot
e\left(\frac{p'^{\mu_1}}{p'\cdot k_1} - \frac{p^{\mu_1}}{p\cdot k_1}\right)\cdot
e\left(\frac{p'^{\mu_2}}{p'\cdot k_2} - \frac{p^{\mu_2}}{p\cdot k_2}\right)\cdots
e\left(\frac{p'^{\mu_n}}{p'\cdot k_n} - \frac{p^{\mu_n}}{p\cdot k_n}\right). \tag{6.77}$$

By multiplying out all the factors, you can see that we get the correct term
for each possible way of dividing the $n$ photons between the two lines.

Next we must decide which photons are real and which are virtual.

We can make a virtual photon by picking two photon momenta $k_i$ and $k_j$,
setting $k_i = -k_j = k$, multiplying by the photon propagator, and integrating
over $k$. For each virtual photon we then obtain the expression

$$\frac{e^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{-i}{k^2 + i\epsilon}
\left(\frac{p'}{p'\cdot k} - \frac{p}{p\cdot k}\right)\cdot
\left(\frac{p'}{-p'\cdot k} - \frac{p}{-p\cdot k}\right) \equiv \mathbf{X}. \tag{6.78}$$

The factor of 1/2 is required because our procedure has counted each Feynman
diagram twice: interchanging $k_i$ and $k_j$ gives back the same diagram. It is
possible to evaluate this expression by careful contour integration, but there
is an easier way. Notice that this approximation scheme assigns to the diagram
with one loop and no external photons the value

$$\bar u(p')\,(i\mathcal{M}_{\text{hard}})\,u(p)\cdot\mathbf{X}.$$

Thus, $\mathbf{X}$ must be precisely the infrared limit of the one-loop correction to the
form factor, as displayed in (6.61):

$$\mathbf{X} = -\frac{\alpha}{2\pi}\,f_{\text{IR}}(q^2)\,\log\left(\frac{-q^2}{\mu^2}\right). \tag{6.79}$$

A direct derivation of this result from (6.78) is given in Weinberg's paper cited
above. Note that result (6.79) followed in our argument of the previous
section only after the subtraction at $q^2 = 0$, and so we should worry whether
(6.79) is consistent with the corresponding subtraction of the $n$th-order
diagram. In addition, some of the diagrams we are summing contain external-leg
corrections, which we have not discussed. Here we simply remark that neither
of these subtleties affects the final answer; the proof requires the heavy
machinery in the paper of Yennie, Frautschi, and Suura.

If there are $m$ virtual photons we get $m$ factors like (6.79), and also an
additional symmetry factor of $1/m!$ since interchanging virtual photons with
each other does not change the diagram. We can then sum over $m$ to obtain
the complete correction due to the presence of arbitrarily many soft virtual
photons:

$$\bar u(p')\,(i\mathcal{M}_{\text{hard}})\,u(p)\,\sum_{m=0}^{\infty}\frac{\mathbf{X}^m}{m!}
= \bar u(p')\,(i\mathcal{M}_{\text{hard}})\,u(p)\,\exp(\mathbf{X}). \tag{6.80}$$


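If in addition to the $m$ virtual photons we also emit a real photon, we must
multiply by its polarization vector, sum over polarizations, and integrate the
squared matrix element over the photon's phase space. This gives an additional
factor

$$\int\frac{d^3k}{(2\pi)^3}\,\frac{1}{2k}\,e^2\,(-g_{\mu\nu})
\left(\frac{p'^\mu}{p'\cdot k} - \frac{p^\mu}{p\cdot k}\right)
\left(\frac{p'^\nu}{p'\cdot k} - \frac{p^\nu}{p\cdot k}\right) \equiv \mathbf{Y} \tag{6.81}$$

in the cross section. Assuming that the energy of the photon is greater than
$\mu$ and less than $E_\ell$ (the detector threshold), this expression is simply

$$\mathbf{Y} = \frac{\alpha}{\pi}\,\mathcal{I}(\mathbf{v},\mathbf{v}')\,\log\left(\frac{E_\ell}{\mu}\right)
= \frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\,\log\left(\frac{E_\ell^2}{\mu^2}\right). \tag{6.82}$$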

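If $n$ real photons are emitted we get $n$ such factors, and also a symmetry
factor of $1/n!$ since there are $n$ identical bosons in the final state. The cross
section for emission of any number of soft photons is therefore

$$\sum_{n=0}^{\infty}\frac{d\sigma}{d\Omega}(p \to p' + n\gamma)
= \frac{d\sigma}{d\Omega}(p \to p')\cdot\sum_{n=0}^{\infty}\frac{1}{n!}\,\mathbf{Y}^n
= \frac{d\sigma}{d\Omega}(p \to p')\cdot\exp(\mathbf{Y}). \tag{6.83}$$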

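Combining our results for virtual and real photons gives our final result
for the measured cross section, to all orders in $\alpha$, for the process $p \to p'$ +
(any number of photons with $k < E_\ell$):

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{meas.}}
= \left(\frac{d\sigma}{d\Omega}\right)_0 \times \exp(2\mathbf{X}) \times \exp(\mathbf{Y})$$

$$= \left(\frac{d\sigma}{d\Omega}\right)_0 \times
\exp\left[-\frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{-q^2}{\mu^2}\right)\right] \times
\exp\left[\frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{E_\ell^2}{\mu^2}\right)\right]$$

$$= \left(\frac{d\sigma}{d\Omega}\right)_0 \times
\exp\left[-\frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{-q^2}{E_\ell^2}\right)\right]. \tag{6.84}$$

The correction factor depends on the detector sensitivity $E_\ell$, but is
independent of the infrared cutoff $\mu$. Note that if we expand this result to $O(\alpha)$,
we recover our earlier result (6.68). Now, however, the correction factor is
controlled in magnitude: always between 0 and 1.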
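The difference between the first-order bracket of (6.68) and the exponentiated factor of (6.84) is easy to see numerically. In the sketch below, offered only as an illustration, the value of $f_{\text{IR}}$, the momentum transfer, and the detector thresholds are hypothetical inputs in units of the electron mass, and the function names are assumptions.

```python
import math

ALPHA = 1.0 / 137.036  # assumed value of the fine-structure constant


def exponent(f_ir, minus_q2, e_l):
    """(alpha / pi) f_IR log(-q^2 / E_l^2), the quantity that is resummed."""
    return (ALPHA / math.pi) * f_ir * math.log(minus_q2 / e_l**2)


def first_order_factor(f_ir, minus_q2, e_l):
    """Bracket of Eq. (6.68) with the O(alpha^2) remainder dropped."""
    return 1.0 - exponent(f_ir, minus_q2, e_l)


def resummed_factor(f_ir, minus_q2, e_l):
    """Exponential factor of Eq. (6.84)."""
    return math.exp(-exponent(f_ir, minus_q2, e_l))


# Hypothetical inputs: f_IR = 5, -q^2 = 1, and ever lower thresholds E_l.
for e_l in (1e-1, 1e-10, 1e-100):
    print(f"E_l = {e_l:.0e}   O(alpha): {first_order_factor(5.0, 1.0, e_l):+.4f}"
          f"   resummed: {resummed_factor(5.0, 1.0, e_l):.4f}")
```

For a modest threshold the two factors nearly coincide. As the threshold is lowered, the first-order expression eventually turns negative, while the exponential stays between 0 and 1 as long as $-q^2 > E_\ell^2$.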

In the limit $-q^2 \gg m^2$, our result becomes

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{meas.}} = \left(\frac{d\sigma}{d\Omega}\right)_0 \times
\left|\exp\left[-\frac{\alpha}{2\pi}\log\left(\frac{-q^2}{m^2}\right)\log\left(\frac{-q^2}{E_\ell^2}\right)\right]\right|^2. \tag{6.85}$$

In this limit, the probability of scattering without emitting a hard photon
decreases faster than any power of $q^2$. The exponential correction factor,
containing the Sudakov double logarithm, is known as the Sudakov form factor.

To conclude this section, let us calculate the probability, in the same
approximation, that some hard scattering process is accompanied by the production
of $n$ soft photons, all with energies between $E_-$ and $E_+$. The phase-space
integral for these photons gives $\log(E_+/E_-)$ instead of $\log(E_\ell/\mu)$. If we
assign photons with energy greater than $E_+$ to the “hard” part of the process,
we find that the cross section is given by (6.84), times the additional factor

$$\text{Prob}(n\gamma \text{ with } E_- < E < E_+) =
\frac{1}{n!}\left[\frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{E_+^2}{E_-^2}\right)\right]^n
\times \exp\left[-\frac{\alpha}{\pi}\,f_{\text{IR}}(q^2)\log\left(\frac{E_+^2}{E_-^2}\right)\right]. \tag{6.86}$$

This expression has the form of a Poisson distribution,

$$P(n) = \frac{1}{n!}\,\lambda^n e^{-\lambda},$$

with

$$\lambda = \langle n\rangle = \frac{\alpha}{\pi}\log\left(\frac{E_+}{E_-}\right)\mathcal{I}(\mathbf{v},\mathbf{v}').$$

This is precisely the semiclassical estimate of the number of radiated photons
that we made in Eq. (6.19).
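The Poisson form of (6.86) can be illustrated with a short calculation. In the sketch below the value of $\mathcal{I}(\mathbf{v},\mathbf{v}')$ and the energy window are hypothetical, and the function names are assumptions. The code checks that the probabilities sum to one and that the mean photon number equals $\lambda$.

```python
import math

ALPHA = 1.0 / 137.036  # assumed value of the fine-structure constant


def mean_photon_number(i_vv, e_plus, e_minus):
    """lambda = (alpha / pi) log(E_+ / E_-) I(v, v'), as below Eq. (6.86)."""
    return (ALPHA / math.pi) * math.log(e_plus / e_minus) * i_vv


def prob_n_photons(n, lam):
    """Poisson probability of n soft photons in the energy window."""
    return lam**n * math.exp(-lam) / math.factorial(n)


# Hypothetical inputs: I(v, v') = 10 and an energy window E_+ / E_- = 1000.
lam = mean_photon_number(10.0, 1000.0, 1.0)
probs = [prob_n_photons(n, lam) for n in range(60)]
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
print(f"lambda = {lam:.5f}   sum of P(n) = {total:.12f}   <n> = {mean:.5f}")
```

With $f_{\text{IR}} = \mathcal{I}/2$ and $\log(E_+^2/E_-^2) = 2\log(E_+/E_-)$, the bracket in (6.86) coincides with this $\lambda$.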


Problems 

6.1 Rosenbluth formula. As discussed in Section 6.2, the exact electromagnetic
interaction vertex for a Dirac fermion can be written quite generally in terms of two
form factors $F_1(q^2)$ and $F_2(q^2)$:

$$\big[\text{vertex diagram}\big] = \bar u(p')\left[\gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}F_2(q^2)\right]u(p),$$

where $q = p' - p$ and $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. If the fermion is a strongly interacting particle
such as the proton, the form factors reflect the structure that results from the strong
interactions and so are not easy to compute from first principles. However, these form
factors can be determined experimentally. Consider the scattering of an electron with
energy $E \gg m_e$ from a proton initially at rest. Show that the above expression for
the vertex leads to the following expression (the Rosenbluth formula) for the elastic
scattering cross section, computed to leading order in $\alpha$ but to all orders in the strong
interactions:

$$\frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2\left[\left(F_1^2 - \frac{q^2}{4m^2}F_2^2\right)\cos^2\frac{\theta}{2}
- \frac{q^2}{2m^2}(F_1 + F_2)^2\sin^2\frac{\theta}{2}\right]}
{2E^2\left[1 + \frac{2E}{m}\sin^2\frac{\theta}{2}\right]\sin^4\frac{\theta}{2}},$$

where $\theta$ is the lab-frame scattering angle and $F_1$ and $F_2$ are to be evaluated at the
$q^2$ associated with elastic scattering at this angle. By measuring $(d\sigma/d\cos\theta)$ as a
function of angle, it is thus possible to extract $F_1$ and $F_2$. Note that when $F_1 = 1$ and
$F_2 = 0$, the Rosenbluth formula reduces to the Mott formula (in the massless limit)
for scattering off a point particle (see Problem 5.1).

6.2 Equivalent photon approximation. Consider the process in which electrons
of very high energy scatter from a target. In leading order in $\alpha$, the electron is connected
to the target by one photon propagator. If the initial and final energies of the electron
are $E$ and $E'$, the photon will carry momentum $q$ such that $q^2 \approx -2EE'(1 - \cos\theta)$.
In the limit of forward scattering, whatever the energy loss, the photon momentum
approaches $q^2 = 0$; thus the reaction is highly peaked in the forward direction. It is
tempting to guess that, in this limit, the virtual photon becomes a real photon. Let us
investigate in what sense that is true.

(a) The matrix element for the scattering process can be written as

$$\mathcal{M} = (-ie)\,\bar u(p')\gamma^\mu u(p)\,\frac{-ig_{\mu\nu}}{q^2}\,\widehat{\mathcal{M}}^\nu(q),$$

where $\widehat{\mathcal{M}}^\nu$ represents the (in general, complicated) coupling of the virtual photon
to the target. Let us analyze the structure of the piece $\bar u(p')\gamma^\mu u(p)$. Let $q =
(q^0, \mathbf{q})$, and define $\tilde q = (q^0, -\mathbf{q})$. We can expand the spinor product as:

$$\bar u(p')\gamma^\mu u(p) = A\cdot q^\mu + B\cdot\tilde q^\mu + C\cdot\epsilon_1^\mu + D\cdot\epsilon_2^\mu,$$

where $A$, $B$, $C$, $D$ are functions of the scattering angle and energy loss and $\epsilon_i$
are two unit vectors transverse to $\mathbf{q}$. By dotting this expression with $q_\mu$, show
that the coefficient $B$ is at most of order $\theta^2$. This will mean that we can ignore
it in the rest of the analysis. The coefficient $A$ is large, but it is also irrelevant,
since, by the Ward identity, $q^\mu\widehat{\mathcal{M}}_\mu = 0$.

(b) Working in the frame where $p = (E, 0, 0, E)$, compute explicitly

$$\bar u(p')\,\gamma\cdot\epsilon_i\,u(p)$$

using massless electrons, $u(p)$ and $u(p')$ spinors of definite helicity, and $\epsilon_1$, $\epsilon_2$
unit vectors parallel and perpendicular to the plane of scattering. We need this
quantity only for scattering near the forward direction, and we need only the
term of order $\theta$. Note, however, that for $\epsilon$ in the plane of scattering, the small $\hat 3$
component of $\epsilon$ also gives a term of order $\theta$ which must be taken into account.

(c) Now write the expression for the electron scattering cross section, in terms of
$|\widehat{\mathcal{M}}^\mu|^2$ and the integral over phase space on the target side. This expression
must be integrated over the final electron momentum $p'$. The integral over $p'^3$
is an integral over the energy loss of the electron. Show that the integral over
$p'_\perp$ diverges logarithmically as $p'_\perp$ or $\theta \to 0$.

(d) The divergence as $\theta \to 0$ appears because we have ignored the electron mass in
too many places. Show that reintroducing the electron mass in the expression
for $q^2$,

$$q^2 = -2(EE' - pp'\cos\theta) + 2m^2,$$

cuts off the divergence and yields a factor of $\log(s/m^2)$ in its place.

(e) Assembling all the factors, and assuming that the target cross sections are
independent of the photon polarization, show that the largest part of the
electron-target scattering cross section is given by considering the electron to be the
source of a beam of real photons with energy distribution ($x = E_\gamma/E$):

$$N_\gamma(x)\,dx = \frac{dx}{x}\,\frac{\alpha}{2\pi}\,\big[1 + (1-x)^2\big]\log\left(\frac{s}{m^2}\right).$$

This is the Weizsäcker-Williams equivalent photon approximation. This
phenomenon allows us, for example, to study photon-photon scattering using $e^+e^-$
collisions. Notice that the distribution we have found here is the same one that
appeared in Problem 5.5 when we considered soft photon emission before electron
scattering. It should be clear that a parallel general derivation can be
constructed for that case.


6.3 Exotic contributions to $g - 2$. Any particle that couples to the electron can
produce a correction to the electron-photon form factors and, in particular, a correction
to $g - 2$. Because the electron $g - 2$ agrees with QED to high accuracy, these corrections
allow us to constrain the properties of hypothetical new particles.

(a) The unified theory of weak and electromagnetic interactions contains a scalar
particle $h$ called the Higgs boson, which couples to the electron according to

$$H_{\text{int}} = \int d^3x\,\frac{\lambda}{\sqrt{2}}\,h\,\bar\psi\psi.$$

Compute the contribution of a virtual Higgs boson to the electron $(g - 2)$, in
terms of $\lambda$ and the mass $m_h$ of the Higgs boson.

(b) QED accounts extremely well for the electron's anomalous magnetic moment. If
$a = (g-2)/2$,

$$|a_{\text{expt.}} - a_{\text{QED}}| < 1 \times 10^{-10}.$$

What limits does this place on $\lambda$ and $m_h$? In the simplest version of the
electroweak theory, $\lambda = 3 \times 10^{-6}$ and $m_h > 60$ GeV. Show that these values are
not excluded. The coupling of the Higgs boson to the muon is larger by a factor
$(m_\mu/m_e)$: $\lambda = 6 \times 10^{-4}$. Thus, although our experimental knowledge of the
muon anomalous magnetic moment is not as precise,

$$|a_{\text{expt.}} - a_{\text{QED}}| < 3 \times 10^{-8},$$

one can still obtain a stronger limit on $m_h$. Is it strong enough?

(c) Some more complex versions of this theory contain a pseudoscalar particle called
the axion, which couples to the electron according to

$$H_{\text{int}} = \int d^3x\,\frac{i\lambda}{\sqrt{2}}\,a\,\bar\psi\gamma^5\psi.$$

The axion may be as light as the electron, or lighter, and may couple more
strongly than the Higgs boson. Compute the contribution of a virtual axion to
the $g - 2$ of the electron, and work out the excluded values of $\lambda$ and $m_a$.
Radiative Corrections:
Some Formal Developments


We cheated four times in the last three chapters,* stating (and sometimes
motivating) a result but postponing its proof. Those results were:

1. The formula for decay rates in terms of S-matrix elements, Eq. (4.86).

2. The master formula for S-matrix elements in terms of Feynman diagrams,
Eq. (4.103).

3. The Ward identity, Eq. (5.79).

4. The ad hoc subtraction to remove the ultraviolet divergence in the
vertex-correction diagram, Eq. (6.55).

It is time now to return to these issues and give them a proper treatment. In
Sections 7.2 through 7.4 we will derive all four of these results. The knowledge
we gain along the way will help us interpret the three remaining loop corrections
for electron scattering from a heavy target shown in (6.1): the external
leg corrections and the vacuum polarization. We will evaluate the former in
Section 7.1 and the latter in Section 7.5.

This chapter will be more abstract than the two preceding ones. Its main
theme will be the singularities of Feynman diagrams viewed as analytic functions
of their external momenta. We will find, however, that this apparently
esoteric subject is rich in physical implications, and that it illuminates the
relation between Feynman diagrams and the general principles of quantum theory.


7.1 Field-Strength Renormalization 

In this section we will investigate the analytic structure of the two-point
correlation function,

$$\langle\Omega|\,T\phi(x)\phi(y)\,|\Omega\rangle \quad\text{or}\quad \langle\Omega|\,T\psi(x)\bar\psi(y)\,|\Omega\rangle.$$

In a free field theory, the two-point function $\langle 0|\,T\phi(x)\phi(y)\,|0\rangle$ has a simple
interpretation: It is the amplitude for a particle to propagate from $y$ to $x$. To
what extent does this interpretation carry over into an interacting theory?


*A fifth cheat, postulating rather than deriving the photon propagator, will be 
remedied in Chapter 9. 
Our analysis of the two-point function will rely only on general principles
of relativity and quantum mechanics; it will not depend on the nature of 
the interactions or on an expansion in perturbation theory. We will, however, 
restrict our consideration to scalar fields. Similar results can be obtained for 
correlation functions of fields with spin; we will display the analogous result 
for Dirac fields at the end of the analysis. 

To dissect the two-point function $\langle\Omega|\,T\phi(x)\phi(y)\,|\Omega\rangle$ we will insert the
identity operator, in the form of a sum over a complete set of states, between
$\phi(x)$ and $\phi(y)$. We choose these states to be eigenstates of the full interacting
Hamiltonian, $H$. Since the momentum operator $\mathbf{P}$ commutes with $H$, we
can also choose the states to be eigenstates of $\mathbf{P}$. But we can also make
a stronger use of Lorentz invariance. Let $|\lambda_0\rangle$ be an eigenstate of $H$ with
momentum zero: $\mathbf{P}\,|\lambda_0\rangle = 0$. Then all the boosts of $|\lambda_0\rangle$ are also eigenstates
of $H$, and these have all possible 3-momenta. Conversely, any eigenstate of $H$
with definite momentum can be written as the boost of some zero-momentum
eigenstate $|\lambda_0\rangle$. The eigenvalues of the 4-momentum operator $P^\mu = (H, \mathbf{P})$
organize themselves into hyperboloids, as shown in Fig. 7.1.

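Recall from Chapter 2 that the completeness relation for the one-particle
states is

$$(\mathbf{1})_{\text{1-particle}} = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf p}}\,|\mathbf p\rangle\langle\mathbf p| . \tag{7.1}$$

We can write an analogous completeness relation for the entire Hilbert space
with the aid of a bit of notation. Let $|\lambda_{\mathbf p}\rangle$ be the boost of $|\lambda_0\rangle$ with momentum
$\mathbf p$, and assume that the states $|\lambda_{\mathbf p}\rangle$, like the one-particle states $|\mathbf p\rangle$, are
relativistically normalized. Let $E_{\mathbf p}(\lambda) \equiv \sqrt{|\mathbf p|^2 + m_\lambda^2}$, where $m_\lambda$ is the "mass"
of the states $|\lambda_{\mathbf p}\rangle$, that is, the energy of the state $|\lambda_0\rangle$. Then the desired
completeness relation is

$$\mathbf{1} = |\Omega\rangle\langle\Omega| + \sum_\lambda\int\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf p}(\lambda)}\,|\lambda_{\mathbf p}\rangle\langle\lambda_{\mathbf p}| , \tag{7.2}$$

where the sum runs over all zero-momentum states $|\lambda_0\rangle$.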

We now insert this expansion between the operators in the two-point
function. Assume for now that $x^0 > y^0$. Let us drop the uninteresting constant
term $\langle\Omega|\phi(x)|\Omega\rangle\langle\Omega|\phi(y)|\Omega\rangle$. (This term is usually zero by symmetry; for
higher-spin fields, it is zero by Lorentz invariance.) The two-point function is
then

$$\langle\Omega|\,\phi(x)\phi(y)\,|\Omega\rangle = \sum_\lambda\int\frac{d^3p}{(2\pi)^3}\,\frac{1}{2E_{\mathbf p}(\lambda)}\,\langle\Omega|\phi(x)|\lambda_{\mathbf p}\rangle\langle\lambda_{\mathbf p}|\phi(y)|\Omega\rangle . \tag{7.3}$$





Figure 7.1. The eigenvalues of the 4-momentum operator $P^\mu = (H, \mathbf{P})$ occupy
a set of hyperboloids in energy-momentum space. For a typical theory
the states consist of one or more particles of mass $m$. Thus there is a hyperboloid
of one-particle states and a continuum of hyperboloids of two-particle
states, three-particle states, and so on. There may also be one or more
bound-state hyperboloids below the threshold for creation of two free particles.

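We can manipulate the matrix elements as follows:

$$\begin{aligned}
\langle\Omega|\phi(x)|\lambda_{\mathbf p}\rangle &= \langle\Omega|\,e^{iP\cdot x}\phi(0)e^{-iP\cdot x}\,|\lambda_{\mathbf p}\rangle \\
&= \langle\Omega|\phi(0)|\lambda_{\mathbf p}\rangle\, e^{-ip\cdot x}\big|_{p^0=E_{\mathbf p}} \\
&= \langle\Omega|\phi(0)|\lambda_0\rangle\, e^{-ip\cdot x}\big|_{p^0=E_{\mathbf p}} .
\end{aligned} \tag{7.4}$$

The last equality is a result of the Lorentz invariance of $\langle\Omega|$ and $\phi(0)$: Insert
factors of $U^{-1}U$, where $U$ is the unitary operator that implements a Lorentz
boost from $\mathbf p$ to $0$, and use $U\phi(0)U^{-1} = \phi(0)$. (For a field with spin, we would
need to keep track of its nontrivial Lorentz transformation.) Introducing an
integration over $p^0$, our expression for the two-point function (still for $x^0 > y^0$)
becomes

$$\langle\Omega|\,\phi(x)\phi(y)\,|\Omega\rangle = \sum_\lambda\int\frac{d^4p}{(2\pi)^4}\,\frac{i}{p^2-m_\lambda^2+i\epsilon}\,e^{-ip\cdot(x-y)}\,\big|\langle\Omega|\phi(0)|\lambda_0\rangle\big|^2 . \tag{7.5}$$

Note the appearance of the Feynman propagator, $D_F(x-y)$, but with $m$
replaced by $m_\lambda$.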
Figure 7.2. The spectral function $\rho(M^2)$ for a typical interacting field theory.
The one-particle states contribute a delta function at $m^2$ (the square of
the particle's mass). Multiparticle states have a continuous spectrum beginning
at $(2m)^2$. There may also be bound states.

Analogous expressions hold for the case $y^0 > x^0$. Both cases can be summarized
in the following general representation of the two-point function (the
Källén-Lehmann spectral representation):

$$\langle\Omega|\,T\phi(x)\phi(y)\,|\Omega\rangle = \int_0^\infty\frac{dM^2}{2\pi}\,\rho(M^2)\,D_F(x-y;M^2), \tag{7.6}$$

where $\rho(M^2)$ is a positive spectral density function,

$$\rho(M^2) = \sum_\lambda (2\pi)\,\delta(M^2-m_\lambda^2)\,\big|\langle\Omega|\phi(0)|\lambda_0\rangle\big|^2 . \tag{7.7}$$

The spectral density $\rho(M^2)$ for a typical theory is plotted in Fig. 7.2.
Note that the one-particle states contribute an isolated delta function to the
spectral density:

$$\rho(M^2) = 2\pi\,\delta(M^2-m^2)\cdot Z + \big(\text{nothing else until } M^2 \gtrsim (2m)^2\big), \tag{7.8}$$

where $Z$ is some number given by the squared matrix element in (7.7). We
refer to $Z$ as the field-strength renormalization. The quantity $m$ is the exact
mass of a single particle, that is, the exact energy eigenvalue at rest. This quantity
will in general differ from the value of the mass parameter that appears in the
Lagrangian. We will refer to the parameter in the Lagrangian as $m_0$, the bare
mass, and refer to $m$ as the physical mass of the $\phi$ boson. Only the physical
mass $m$ is directly observable.

The spectral decomposition (7.6) yields the following form for the Fourier
transform of the two-point function:

$$\begin{aligned}
\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\phi(x)\phi(0)\,|\Omega\rangle &= \int_0^\infty\frac{dM^2}{2\pi}\,\rho(M^2)\,\frac{i}{p^2-M^2+i\epsilon} \\
&= \frac{iZ}{p^2-m^2+i\epsilon} + \int_{\sim 4m^2}^\infty\frac{dM^2}{2\pi}\,\rho(M^2)\,\frac{i}{p^2-M^2+i\epsilon} .
\end{aligned} \tag{7.9}$$

Figure 7.3. Analytic structure in the complex $p^2$-plane of the Fourier transform
of the two-point function for a typical theory. The one-particle states
contribute an isolated pole at the square of the particle mass. States of two
or more free particles give a branch cut, while bound states give additional
poles.

The analytic structure of this function in the complex $p^2$-plane is shown in
Fig. 7.3. The first term gives an isolated simple pole at $p^2 = m^2$, while the
second term contributes a branch cut beginning at $p^2 = (2m)^2$. If there are
any two-particle bound states, these will appear as additional delta functions
in $\rho(M^2)$ and thus as additional poles below the cut.

In Section 2.4, we found an explicit result for the two-point correlation
function in the theory of a free scalar field:

$$\int d^4x\,e^{ip\cdot x}\,\langle 0|\,T\phi(x)\phi(0)\,|0\rangle = \frac{i}{p^2-m^2+i\epsilon} . \tag{7.10}$$

We interpreted this formula, for $x^0 > 0$, as the amplitude for a particle to
propagate from $0$ to $x$. Equation (7.9) shows that the two-point function
takes a similar form in the most general theory of an interacting scalar field.
The general expression is essentially a sum of scalar propagation amplitudes
for states created from the vacuum by the field operator $\phi(0)$. There are
two differences between (7.9) and (7.10). First, Eq. (7.9) contains the
field-strength renormalization factor $Z = |\langle\lambda_0|\phi(0)|\Omega\rangle|^2$, the probability for $\phi(0)$
to create a given state from the vacuum. In (7.10), this factor is included
implicitly, since $\langle p|\phi(0)|0\rangle = 1$ in free field theory. Second, Eq. (7.9) contains
contributions from multiparticle intermediate states with a continuous mass
spectrum. In free field theory, $\phi(0)$ can create only a single particle from the
vacuum. With these two differences, (7.9) is a direct generalization of (7.10).

It will be important in our later analysis that the contributions to (7.9)
from one-particle and multiparticle intermediate states can be distinguished
by the strength of their analytic singularities. The poles in $p^2$ come only from
one-particle intermediate states, while multiparticle intermediate states give
weaker branch cut singularities. We will see in the next section that this rather
formal observation generalizes to higher-point correlation functions and plays
a crucial role in our derivation of the diagrammatic formula for S-matrix
elements.

The analysis of this section generalizes directly to two-point functions of
higher-spin fields. The main complication comes in the generalization of the
manipulation (7.4), since now the field has a nontrivial transformation law
under boosts. In general, several invariant spectral functions are required to
represent the multiparticle states. But this complication does not affect the
major result that a pole in $p^2$ can arise only from the contribution of a
single-particle state created by the field operator. The two-point function of Dirac
fields, for example, has the structure

$$\begin{aligned}
\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\psi(x)\bar\psi(0)\,|\Omega\rangle &= \frac{iZ_2\sum_s u^s(p)\bar u^s(p)}{p^2-m^2+i\epsilon} + \cdots \\
&= \frac{iZ_2(\not{p}+m)}{p^2-m^2+i\epsilon} + \cdots ,
\end{aligned} \tag{7.11}$$

where the omitted terms give the multiparticle branch cut. As in the scalar
case, the constant $Z_2$ is the probability for the quantum field to create or
annihilate an exact one-particle eigenstate of $H$:

$$\langle\Omega|\,\psi(0)\,|p,s\rangle = \sqrt{Z_2}\,u^s(p). \tag{7.12}$$

(For an antiparticle, replace $u$ with $\bar v$.) At the pole, the Dirac two-point function
is exactly that of a free field with the physical mass, aside from the
rescaling factor $Z_2$.


An Example: The Electron Self-Energy 

This nonperturbative analysis of the two-point correlation function has been 
very different from our usual direct analysis of Feynman diagrams. But since 
this derivation was done in complete generality, the singularity structure of 
the two-point function that it implies ought also to be visible in a Feynman 
diagram computation. In the rest of this section we will explicitly check our 
results for the electron two-point function in QED. 

The electron two-point function is equal to the sum of diagrams

$$\langle\Omega|\,T\psi(x)\bar\psi(y)\,|\Omega\rangle = (\text{free propagator from } y \text{ to } x) + (\text{one-loop self-energy diagram}) + \cdots . \tag{7.13}$$

Each of these diagrams, according to the Feynman rules for correlation functions,
contains a factor of $e^{-ip\cdot(x-y)}$ for the two external points and an
integration $\int d^4p/(2\pi)^4$ over the momentum $p$ carried by the initial and final
propagators. We will consistently omit these factors in this section; in other
words, each diagram will denote the corresponding term in the Fourier transform
of the two-point function.

The first diagram is just the free-field propagator:

$$(\text{free propagator}) = \frac{i(\not{p}+m_0)}{p^2-m_0^2+i\epsilon} . \tag{7.14}$$

Throughout this calculation, we will write $m_0$ instead of $m$ as the mass in the
electron propagator. This makes explicit the fact noted above that the mass
appearing in the Lagrangian differs, in general, from the observable rest energy
of a particle. However, if a perturbation expansion is applicable, the
leading-order expression for the propagator should approximate the exact expression.
Indeed, the function (7.14) has a pole, of just the form of (7.11), at $p^2 = m_0^2$.
We therefore expect that the complete expression for the two-point function
also has a pole of this form, at a slightly shifted location $m^2 = m_0^2 + O(\alpha)$.

The second diagram in (7.13), called the electron self-energy, is somewhat
more complicated:

$$(\text{one-loop self-energy diagram}) = \frac{i(\not{p}+m_0)}{p^2-m_0^2}\,\big[-i\Sigma_2(p)\big]\,\frac{i(\not{p}+m_0)}{p^2-m_0^2}, \tag{7.15}$$

where

$$-i\Sigma_2(p) = (-ie)^2\int\frac{d^4k}{(2\pi)^4}\,\gamma^\mu\,\frac{i(\not{k}+m_0)}{k^2-m_0^2+i\epsilon}\,\gamma_\mu\,\frac{-i}{(p-k)^2-\mu^2+i\epsilon} . \tag{7.16}$$

(The notation $\Sigma_2$ indicates that this is the second-order (in $e$) contribution to
a quantity $\Sigma$ that we will define below.) The integral $\Sigma_2$ has an infrared
divergence, which we have regularized by adding a small photon mass $\mu$.
Outside this integral, the diagram seems to have a double pole at $p^2 = m_0^2$.
All in all, the form of this correction is quite unpleasant. But let us press on
and try to evaluate $\Sigma_2(p)$ using the calculational techniques developed for the
vertex correction in Section 6.3.

First introduce a Feynman parameter to combine the two denominators:

$$\frac{1}{k^2-m_0^2+i\epsilon}\,\frac{1}{(p-k)^2-\mu^2+i\epsilon} = \int_0^1 dx\,\frac{1}{\big[k^2-2x\,k\cdot p+xp^2-x\mu^2-(1-x)m_0^2+i\epsilon\big]^2} .$$
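The Feynman-parameter identity above, $1/(AB) = \int_0^1 dx\,[(1-x)A + xB]^{-2}$, can be checked numerically. The following is a minimal sketch, not part of the derivation: it assumes real, positive stand-in values `a` and `b` for the two denominators (the true denominators are complex once the $i\epsilon$ terms are kept), and the helper name `feynman_combined` is hypothetical.

```python
def feynman_combined(a, b, n=100_000):
    """Midpoint-rule estimate of the integral over x in [0, 1]
    of 1 / [(1 - x) * a + x * b]**2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 1.0 / ((1.0 - x) * a + x * b) ** 2
    return total * h

# Stand-ins for the two propagator denominators (assumed real and positive).
a, b = 2.0, 5.0
lhs = 1.0 / (a * b)            # product of the two separate factors
rhs = feynman_combined(a, b)   # single combined denominator, integrated over x
print(lhs, rhs)
```

The two printed numbers should agree to many digits, since the integrand is smooth on the whole interval for positive `a` and `b`.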
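Now complete the square and define a shifted momentum $\ell \equiv k - xp$. Dropping
the term linear in $\ell$ from the numerator, we have

$$-i\Sigma_2(p) = -e^2\int_0^1 dx\int\frac{d^4\ell}{(2\pi)^4}\,\frac{-2x\not{p}+4m_0}{[\ell^2-\Delta+i\epsilon]^2}, \tag{7.17}$$

where $\Delta = -x(1-x)p^2 + x\mu^2 + (1-x)m_0^2$. The integral over $\ell$ is divergent. To
evaluate it, we first regulate it using the Pauli-Villars procedure (6.51):

$$\frac{1}{(p-k)^2-\mu^2+i\epsilon} \;\to\; \frac{1}{(p-k)^2-\mu^2+i\epsilon} - \frac{1}{(p-k)^2-\Lambda^2+i\epsilon} .$$

The second term will have the same form as (7.17), but with $\mu$ replaced by $\Lambda$.
As in Section 6.3, we now Wick-rotate and substitute the Euclidean variable
$\ell_E^0 = -i\ell^0$. This gives

$$\int\frac{d^4\ell}{(2\pi)^4}\,\frac{1}{[\ell^2-\Delta+i\epsilon]^2} \;\to\; \frac{i}{(4\pi)^2}\int_0^\infty d\ell_E^2\left(\frac{\ell_E^2}{[\ell_E^2+\Delta]^2} - \frac{\ell_E^2}{[\ell_E^2+\Delta_\Lambda]^2}\right) = \frac{i}{(4\pi)^2}\log\!\left(\frac{\Delta_\Lambda}{\Delta}\right), \tag{7.18}$$

where

$$\Delta_\Lambda = -x(1-x)p^2 + x\Lambda^2 + (1-x)m_0^2 \;\xrightarrow[\Lambda\to\infty]{}\; x\Lambda^2 .$$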
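The Pauli-Villars-regulated radial integral, $\int_0^\infty du\,\big[u/(u+\Delta)^2 - u/(u+\Delta_\Lambda)^2\big] = \log(\Delta_\Lambda/\Delta)$ with $u = \ell_E^2$, can be checked numerically. The sketch below is illustrative only: it assumes positive real values for $\Delta$ and $\Delta_\Lambda$, maps the half-line to $[0,1)$ with the substitution $u = t/(1-t)$ (a choice made here, not taken from the text), and uses a hypothetical helper name.

```python
import math

def regulated_radial_integral(delta, delta_lam, n=200_000):
    """Midpoint-rule estimate of the integral over u in [0, inf) of
    u/(u + delta)**2 - u/(u + delta_lam)**2, using u = t / (1 - t)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        u = t / (1.0 - t)
        jacobian = 1.0 / (1.0 - t) ** 2   # du/dt
        g = u / (u + delta) ** 2 - u / (u + delta_lam) ** 2
        total += g * jacobian
    return total * h

delta, delta_lam = 1.0, 100.0   # assumed sample values, with delta_lam >> delta
print(regulated_radial_integral(delta, delta_lam), math.log(delta_lam / delta))
```

Each term alone diverges logarithmically at large $u$; only the difference is finite, which is why the two terms are integrated together.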


The final result is therefore

$$\Sigma_2(p) = \frac{\alpha}{2\pi}\int_0^1 dx\,(2m_0 - x\not{p})\,\log\!\left(\frac{x\Lambda^2}{(1-x)m_0^2 + x\mu^2 - x(1-x)p^2}\right) . \tag{7.19}$$

Before discussing the divergences in this expression, let us work out its
analytic behavior as a function of $p^2$. The logarithm in (7.19) has a branch
cut when its argument becomes negative, and for any fixed $x$ this will occur
for sufficiently large $p^2$. More exactly, the cut begins at the point where

$$(1-x)m_0^2 + x\mu^2 - x(1-x)p^2 = 0 .$$


Solving this equation for $x$, we find

$$\begin{aligned}
x &= \frac12 + \frac{m_0^2}{2p^2} - \frac{\mu^2}{2p^2} \pm \sqrt{\frac{(p^2+m_0^2-\mu^2)^2}{4p^4} - \frac{m_0^2}{p^2}} \\
&= \frac12 + \frac{m_0^2}{2p^2} - \frac{\mu^2}{2p^2} \pm \frac{1}{2p^2}\sqrt{\big[p^2-(m_0+\mu)^2\big]\big[p^2-(m_0-\mu)^2\big]} .
\end{aligned} \tag{7.20}$$

The branch cut of $\Sigma_2(p^2)$ begins at the minimum value of $p^2$ such that this
equation has a real solution for $x$ between 0 and 1. This occurs when $p^2 =
(m_0+\mu)^2$, that is, at the threshold for creation of a two-particle (electron
plus photon) state. In fact, it is a simple exercise in relativistic kinematics to
show that the square root in (7.20), written in the form

$$k = \frac{1}{2\sqrt{p^2}}\sqrt{\big[p^2-(m_0+\mu)^2\big]\big[p^2-(m_0-\mu)^2\big]} ,$$

is precisely the momentum in the center-of-mass frame for two particles of
mass $m_0$ and $\mu$ and total energy $\sqrt{p^2}$. It is natural that this momentum becomes
real at the two-particle threshold. The location of the branch cut is
exactly where we would expect from the Källén-Lehmann spectral representation.†
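The threshold statement can be made concrete with a short numerical check. The sketch below solves $(1-x)m_0^2 + x\mu^2 - x(1-x)p^2 = 0$ for $x$ and confirms that the square-root factor, divided by $2\sqrt{p^2}$, is the center-of-mass momentum of two particles with masses $m_0$ and $\mu$. The sample masses and function names are assumptions chosen for illustration.

```python
import math

def x_roots(p2, m0, mu):
    """Roots in x of (1-x)*m0^2 + x*mu^2 - x*(1-x)*p2 = 0.
    Returns None when the roots are complex."""
    disc = (p2 - (m0 + mu) ** 2) * (p2 - (m0 - mu) ** 2)
    if disc < 0:
        return None
    mid = 0.5 + (m0**2 - mu**2) / (2.0 * p2)
    half = math.sqrt(disc) / (2.0 * p2)
    return (mid - half, mid + half)

def cm_momentum(p2, m0, mu):
    """Center-of-mass momentum for masses m0 and mu at total energy sqrt(p2)."""
    disc = (p2 - (m0 + mu) ** 2) * (p2 - (m0 - mu) ** 2)
    return math.sqrt(disc) / (2.0 * math.sqrt(p2))

m0, mu = 1.0, 0.2                 # assumed sample masses; threshold at p2 = 1.44
print(x_roots(1.3, m0, mu))       # between (m0-mu)^2 and (m0+mu)^2: no real root
print(x_roots(0.5, m0, mu))       # below (m0-mu)^2: real roots, but outside [0, 1]
print(x_roots(2.0, m0, mu))       # above threshold: both roots lie in [0, 1]

k = cm_momentum(2.0, m0, mu)
# Energy conservation in the center-of-mass frame: the two numbers should agree.
print(math.sqrt(k * k + m0 * m0) + math.sqrt(k * k + mu * mu), math.sqrt(2.0))
```

The middle case shows why the text asks for a real solution between 0 and 1: real roots also exist for $p^2 < (m_0-\mu)^2$, but they fall outside the integration range and produce no cut.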

We have now located the two-particle branch cut predicted by the
Källén-Lehmann representation, but we have not found the expected simple pole at
$p^2 = m^2$. To find it we must actually include an infinite series of Feynman
diagrams. Fortunately, this series will be easily summed.

Let us define a one-particle irreducible (1PI) diagram to be any diagram
that cannot be split in two by removing a single line:

[Diagrams: an example of a 1PI diagram and an example of a diagram that is not 1PI.]

Let $-i\Sigma(p)$ denote the sum of all 1PI diagrams with two external fermion
lines:

$$-i\Sigma(p) = (\text{sum of all 1PI diagrams with two external fermion lines}) . \tag{7.21}$$

(Although each diagram has two external lines, the Feynman propagators for
these lines are not to be included in the expression for $\Sigma(p)$.) To leading order
in $\alpha$ we see that $\Sigma = \Sigma_2$.

The Fourier transform of the two-point function can now be written as

$$\int d^4x\,\langle\Omega|\,T\psi(x)\bar\psi(0)\,|\Omega\rangle\,e^{ip\cdot x} = \frac{i(\not{p}+m_0)}{p^2-m_0^2} + \frac{i(\not{p}+m_0)}{p^2-m_0^2}\,(-i\Sigma)\,\frac{i(\not{p}+m_0)}{p^2-m_0^2} + \cdots , \tag{7.22}$$

where the successive terms correspond to diagrams with zero, one, two, ... 1PI
insertions.

†In real QED, $\mu = 0$ and the two-particle branch cut merges with the one-particle
pole. This subtlety plays a role in the full treatment of the cancellation of infrared
divergences, but it is beyond the scope of our present analysis.
The first diagram has a simple pole at $p^2 = m_0^2$. Each diagram in the second
class has a double pole at $p^2 = m_0^2$. Each diagram in the third class has a
triple pole. The behavior near $p^2 = m_0^2$ gets worse and worse as we include
more and more diagrams. But fortunately, the sum of all the diagrams forms
a geometric series. Note that $\Sigma(p)$ commutes with $\not{p}$, since $\Sigma(p)$ is a function
only of pure numbers and $\not{p}$. In fact, we can consider $\Sigma(p)$ to be a function
of $\not{p}$, writing $p^2 = (\not{p})^2$. Then we can rewrite each electron propagator as
$i/(\not{p}-m_0)$ and express the above series as

$$\begin{aligned}
\int d^4x\,\langle\Omega|\,T\psi(x)\bar\psi(0)\,|\Omega\rangle\,e^{ip\cdot x} &= \frac{i}{\not{p}-m_0} + \frac{i}{\not{p}-m_0}\left(\frac{\Sigma(\not{p})}{\not{p}-m_0}\right) + \frac{i}{\not{p}-m_0}\left(\frac{\Sigma(\not{p})}{\not{p}-m_0}\right)^2 + \cdots \\
&= \frac{i}{\not{p}-m_0-\Sigma(\not{p})} .
\end{aligned} \tag{7.23}$$

The full propagator has a simple pole, which is shifted away from $m_0$ by $\Sigma(\not{p})$.

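The location of this pole, the physical mass $m$, is the solution of the
equation

$$\big[\not{p} - m_0 - \Sigma(\not{p})\big]\Big|_{\not{p}=m} = 0 . \tag{7.24}$$

Notice that, if $\Sigma(\not{p})$ is defined by the convention (7.21), then a positive
contribution to $\Sigma$ yields a positive shift of the electron mass. Close to the pole,
the denominator of (7.23) has the form

$$(\not{p}-m)\cdot\left(1 - \frac{d\Sigma}{d\not{p}}\bigg|_{\not{p}=m}\right) + O\big((\not{p}-m)^2\big) . \tag{7.25}$$

Thus the full electron propagator has a single-particle pole of just the form
(7.11), with $m$ given by (7.24) and

$$Z_2^{-1} = 1 - \frac{d\Sigma}{d\not{p}}\bigg|_{\not{p}=m} . \tag{7.26}$$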
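The resummation and the resulting pole position and residue can be illustrated with a toy model. In the sketch below, $\not{p}$ is replaced by a real number `s` and the self-energy by a made-up linear function $\Sigma(s) = a + bs$; this form and the values of `a`, `b`, and `m0` are assumptions for illustration only, not the QED result, and the overall factor of $i$ is dropped.

```python
def sigma(s, a=0.05, b=0.02):
    """Toy self-energy (hypothetical linear form, not the QED Sigma)."""
    return a + b * s

def partial_sum(s, m0, n_terms):
    """First n_terms of the geometric series (factor i dropped):
    1/(s - m0) * sum_n (Sigma(s) / (s - m0))**n."""
    r = sigma(s) / (s - m0)
    return sum(r**n for n in range(n_terms)) / (s - m0)

def full_propagator(s, m0):
    """Summed series: 1 / (s - m0 - Sigma(s))."""
    return 1.0 / (s - m0 - sigma(s))

m0, a, b = 1.0, 0.05, 0.02
print(partial_sum(3.0, m0, 30), full_propagator(3.0, m0))

# Pole position from the toy version of (7.24): m - m0 - (a + b*m) = 0.
m = (m0 + a) / (1.0 - b)
# Residue from the toy version of (7.26): Z2 = 1 / (1 - dSigma/ds).
z2 = 1.0 / (1.0 - b)
s_near = m + 1e-6
print((s_near - m) * full_propagator(s_near, m0), z2)
```

In this toy model a positive `a` moves the pole above `m0`, which mirrors the remark that a positive contribution to $\Sigma$ gives a positive mass shift.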


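Our explicit calculation of $\Sigma_2$ allows us to compute the first corrections
to $m$ and $Z_2$. Let us begin with $m$. To order $\alpha$, the mass shift is

$$\delta m = m - m_0 = \Sigma_2(\not{p}=m) \approx \Sigma_2(\not{p}=m_0) . \tag{7.27}$$

Thus, using (7.19),

$$\delta m = \frac{\alpha}{2\pi}\,m_0\int_0^1 dx\,(2-x)\,\log\!\left(\frac{x\Lambda^2}{(1-x)^2m_0^2 + x\mu^2}\right) . \tag{7.28}$$

The mass shift is ultraviolet divergent; the divergent term has the form

$$\delta m \;\xrightarrow[\Lambda\to\infty]{}\; \frac{3\alpha}{4\pi}\,m_0\,\log\!\left(\frac{\Lambda^2}{m_0^2}\right) . \tag{7.29}$$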
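The coefficient $3\alpha/4\pi$ of the logarithm can be checked by evaluating the Feynman-parameter integral for the mass shift, $\delta m/m_0 = \frac{\alpha}{2\pi}\int_0^1 dx\,(2-x)\log\big(x\Lambda^2/[(1-x)^2m_0^2 + x\mu^2]\big)$, at two cutoffs and taking the slope in $\log\Lambda^2$. This is a sketch under stated assumptions: units with $m_0 = 1$, an arbitrary small photon mass, an approximate value for $\alpha$, and a hypothetical function name.

```python
import math

ALPHA = 1 / 137.036  # approximate fine-structure constant (assumed value)

def mass_shift_ratio(lam, m0=1.0, mu=0.01, n=100_000):
    """delta m / m0 from the one-loop integral, by the midpoint rule in x."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        arg = x * lam**2 / ((1.0 - x) ** 2 * m0**2 + x * mu**2)
        total += (2.0 - x) * math.log(arg)
    return ALPHA / (2.0 * math.pi) * total * h

lam1, lam2 = 1.0e3, 1.0e6   # two sample cutoffs in units of m0
slope = (mass_shift_ratio(lam2) - mass_shift_ratio(lam1)) / math.log(lam2**2 / lam1**2)
print(slope, 3.0 * ALPHA / (4.0 * math.pi))
```

Because $\Lambda$ enters only through $\log\Lambda^2$ multiplied by $\int_0^1 dx\,(2-x) = 3/2$, the slope is independent of $\mu$ and of the two cutoffs chosen.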
Is it a problem that $m$ differs from $m_0$ by a divergent quantity? This question
has two levels, those of concept and practice.

On the conceptual level, we should fully expect the mass of the electron
to be modified by its coupling to the electromagnetic field. In classical
electrodynamics, the rest energy of any charge is increased by the energy of its
electrostatic field, and this energy shift diverges in the case of a point charge:

$$\int d^3r\,\frac12|\mathbf{E}|^2 = \int d^3r\,\frac12\left(\frac{e}{4\pi r^2}\right)^2 = \frac{\alpha}{2}\int\frac{dr}{r^2} . \tag{7.30}$$

In fact, it is puzzling why the divergence in (7.29) is so weak, logarithmic in
$\Lambda$ rather than linear as in (7.30). To understand this feature, suppose that $m_0$
were set to 0. Then the two helicity components of the electron field $\psi_L$ and
$\psi_R$ would not be coupled by any term in the QED Hamiltonian. This would
imply that perturbative corrections could never induce a coupling of $\psi_L$ and
$\psi_R$, nor, in particular, an electron mass term. In other words, $\delta m$ must vanish
when $m_0 = 0$. The mass shift must therefore be proportional to $m_0$, and so,
by dimensional analysis, it can depend only logarithmically on $\Lambda$.

On a practical level, the infinite mass shift casts doubt on our perturbative
calculations. For example, all of the theoretical results in Chapter 5 should
technically involve $m_0$ rather than $m$. To compare theory to experiment we
must eliminate $m_0$ in favor of $m$, using the relation $m_0 = m + O(\alpha)$. Since the
"small" $O(\alpha)$ correction is infinite, the validity of this procedure is far from
obvious. The validity of perturbation theory would be more plausible if we
could compute Feynman diagrams using the propagator $i/(\not{p}-m)$, which has
the correct pole location, instead of $i/(\not{p}-m_0)$. In Chapter 10 we will see how
to rearrange the perturbation series in such a way that $m_0$ is systematically
eliminated in favor of $m$ and the zeroth-order propagator has its pole at the
physical mass. In the remainder of this chapter, we will continue to simply
replace $m_0$ by $m$ in expressions for order-$\alpha$ corrections.

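Finally, let us examine the perturbative correction to $Z_2$. From (7.26), we
find that the order-$\alpha$ correction $\delta Z_2 = (Z_2 - 1)$ is

$$\delta Z_2 = \frac{d\Sigma_2}{d\not{p}}\bigg|_{\not{p}=m} = \frac{\alpha}{2\pi}\int_0^1 dx\left[-x\log\!\left(\frac{x\Lambda^2}{(1-x)^2m^2+x\mu^2}\right) + 2(2-x)\,\frac{x(1-x)m^2}{(1-x)^2m^2+x\mu^2}\right] . \tag{7.31}$$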


This expression is also logarithmically ultraviolet divergent. We will discuss 
the observability of this divergent term at the end of Section 7.2. However, it 
is interesting to note, even before that discussion, that (7.31) is very similar 
in form to the value of the ad hoc subtraction that we made in our calculation 
of the electron vertex correction in Section 6.3. From Eq. (6.56), the value of 
this subtraction was

$$\begin{aligned}
\delta F_1(0) &= \frac{\alpha}{2\pi}\int_0^1 dx\,dy\,dz\;\delta(x+y+z-1)\left[\log\!\left(\frac{z\Lambda^2}{(1-z)^2m^2+z\mu^2}\right) + \frac{(1-4z+z^2)m^2}{(1-z)^2m^2+z\mu^2}\right] \\
&= \frac{\alpha}{2\pi}\int_0^1 dz\,(1-z)\left[\log\!\left(\frac{z\Lambda^2}{(1-z)^2m^2+z\mu^2}\right) + \frac{(1-4z+z^2)m^2}{(1-z)^2m^2+z\mu^2}\right] .
\end{aligned} \tag{7.32}$$


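Using the integration by parts

$$\begin{aligned}
\int_0^1 dz\,(1-2z)\,\log\!\left(\frac{z\Lambda^2}{(1-z)^2m^2+z\mu^2}\right) &= -\int_0^1 dz\left[(1-z) + z(1-z)\,\frac{2(1-z)m^2-\mu^2}{(1-z)^2m^2+z\mu^2}\right] \\
&= -\int_0^1 dz\,\frac{(1-z)(1-z^2)m^2}{(1-z)^2m^2+z\mu^2} ,
\end{aligned}$$

it is not hard to show that $\delta F_1(0) + \delta Z_2 = 0$. This identity will play a crucial
role in justifying the ad hoc subtraction of Section 6.3.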
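The cancellation $\delta F_1(0) + \delta Z_2 = 0$ can also be confirmed numerically from the two one-dimensional Feynman-parameter integrals, $\delta Z_2 = \frac{\alpha}{2\pi}\int_0^1 dx\,[-x\log L(x) + 2(2-x)x(1-x)m^2/D(x)]$ and $\delta F_1(0) = \frac{\alpha}{2\pi}\int_0^1 dz\,(1-z)[\log L(z) + (1-4z+z^2)m^2/D(z)]$, with $D(z) = (1-z)^2m^2 + z\mu^2$ and $L(z) = z\Lambda^2/D(z)$. The sketch below works in units of $\alpha/2\pi$; the sample values of $\Lambda$, $m$, $\mu$ and the helper names are assumptions for illustration.

```python
import math

def integrate(f, n=200_000):
    """Midpoint rule on [0, 1]; the endpoints (log singularities) are never sampled."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def log_term(z, lam, m, mu):
    return math.log(z * lam**2 / ((1.0 - z) ** 2 * m**2 + z * mu**2))

def delta_z2(lam, m, mu):
    """delta Z_2 in units of alpha / (2 pi)."""
    def f(x):
        d = (1.0 - x) ** 2 * m**2 + x * mu**2
        return -x * log_term(x, lam, m, mu) + 2.0 * (2.0 - x) * x * (1.0 - x) * m**2 / d
    return integrate(f)

def delta_f1(lam, m, mu):
    """delta F_1(0) in units of alpha / (2 pi), after the x and y integrals."""
    def f(z):
        d = (1.0 - z) ** 2 * m**2 + z * mu**2
        return (1.0 - z) * (log_term(z, lam, m, mu) + (1.0 - 4.0 * z + z * z) * m**2 / d)
    return integrate(f)

lam, m, mu = 100.0, 1.0, 0.1   # assumed sample cutoff, electron mass, photon mass
print(delta_z2(lam, m, mu), delta_f1(lam, m, mu))
print(delta_z2(lam, m, mu) + delta_f1(lam, m, mu))
```

Each quantity separately grows with $\log\Lambda^2$ (and with $\log\mu$ as $\mu \to 0$), while the sum stays at zero up to quadrature error.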


7.2 The LSZ Reduction Formula 


In the last section we saw that the Fourier transform of the two-point
correlation function, considered as an analytic function of $p^2$, has a simple pole at
the mass of the one-particle state:

$$\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\phi(x)\phi(0)\,|\Omega\rangle \;\underset{p^2\to m^2}{\sim}\; \frac{iZ}{p^2-m^2+i\epsilon} . \tag{7.33}$$

(Here and throughout this section we use the symbol $\sim$ to mean that the poles
of both sides are identical; there are additional finite terms, given in this case
by Eq. (7.9).) In this section we will generalize this result to higher correlation
functions. We will derive a general relation between correlation functions and
S-matrix elements first obtained by Lehmann, Symanzik, and Zimmermann
and known as the LSZ reduction formula.‡ This result, combined with our
Feynman rules for computing correlation functions, will justify Eq. (4.103),
our master formula for S-matrix elements in terms of Feynman diagrams. For
simplicity, we will carry out the whole analysis for the case of scalar fields.


‡H. Lehmann, K. Symanzik, and W. Zimmermann, Nuovo Cimento 1, 1425 (1955).
The strategy of the argument will be as follows. To calculate the S-matrix
element for a 2-body $\to$ $n$-body scattering process, we begin with the correlation
function of $n + 2$ Heisenberg fields. Fourier-transforming with respect
to the coordinate of any one of these fields, we will find a pole of the form
(7.33) in the Fourier-transform variable $p^2$. We will argue that the one-particle
states associated with these poles are in fact asymptotic states, that is, states
given by the limit of well-separated wavepackets as they become concentrated
around definite momenta. Taking the limit in which all $n + 2$ external particles
go on-shell, we can then interpret the coefficient of the multiple pole as
an S-matrix element.

To begin, let us Fourier-transform the $(n + 2)$-point correlation function
with respect to one argument $x$. We must then analyze the integral

$$\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\{\phi(x)\phi(z_1)\phi(z_2)\cdots\}\,|\Omega\rangle .$$

We would like to identify poles in the variable $p^0$. To do this, divide the
integral over $x^0$ into three regions:

$$\int dx^0 = \int_{T_+}^{\infty} dx^0 + \int_{T_-}^{T_+} dx^0 + \int_{-\infty}^{T_-} dx^0 , \tag{7.34}$$

where $T_+$ is much greater than all the $z_i^0$ and $T_-$ is much less than all the $z_i^0$.
Call these three intervals regions I, II, and III. Since region II is bounded and
the integrand depends on $p^0$ through the analytic function $\exp(ip^0x^0)$, the
contribution from this region will be analytic in $p^0$. However, regions I and
III, which are unbounded, may develop singularities in $p^0$.

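Consider first region I. Here $x^0$ is the latest time, so $\phi(x)$ stands first in
the time ordering. Insert a complete set of intermediate states in the form of
(7.2):

$$\mathbf{1} = \sum_\lambda\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2E_{\mathbf q}(\lambda)}\,|\lambda_{\mathbf q}\rangle\langle\lambda_{\mathbf q}| .$$

The integral over region I then becomes

$$\int_{T_+}^{\infty}dx^0\int d^3x\;e^{ip^0x^0}e^{-i\mathbf p\cdot\mathbf x}\sum_\lambda\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2E_{\mathbf q}(\lambda)}\,\langle\Omega|\phi(x)|\lambda_{\mathbf q}\rangle\,\langle\lambda_{\mathbf q}|\,T\{\phi(z_1)\phi(z_2)\cdots\}\,|\Omega\rangle . \tag{7.35}$$

Using Eq. (7.4),

$$\langle\Omega|\phi(x)|\lambda_{\mathbf q}\rangle = \langle\Omega|\phi(0)|\lambda_0\rangle\,e^{-iq\cdot x}\big|_{q^0=E_{\mathbf q}(\lambda)} ,$$

and including a factor $e^{-\epsilon x^0}$ to insure that the integral is well defined, this
integral becomes

$$\begin{aligned}
&\sum_\lambda\int_{T_+}^{\infty}dx^0\int\frac{d^3q}{(2\pi)^3}\,\frac{1}{2E_{\mathbf q}(\lambda)}\,e^{ip^0x^0}e^{-iq^0x^0}e^{-\epsilon x^0}\,\langle\Omega|\phi(0)|\lambda_0\rangle\,(2\pi)^3\delta^{(3)}(\mathbf p-\mathbf q)\,\langle\lambda_{\mathbf q}|\,T\{\phi(z_1)\cdots\}\,|\Omega\rangle \\
&\qquad = \sum_\lambda\frac{1}{2E_{\mathbf p}(\lambda)}\,\frac{i\,e^{i(p^0-E_{\mathbf p}+i\epsilon)T_+}}{p^0-E_{\mathbf p}(\lambda)+i\epsilon}\,\langle\Omega|\phi(0)|\lambda_0\rangle\,\langle\lambda_{\mathbf p}|\,T\{\phi(z_1)\cdots\}\,|\Omega\rangle .
\end{aligned} \tag{7.36}$$

The denominator is just that of Eq. (7.5): $p^2 - m_\lambda^2$. There is an analytic
singularity in $p^0$; as in Section 7.1, this singularity will be either a pole or
a branch cut depending upon whether or not the rest energy $m_\lambda$ is isolated.
The one-particle state corresponds to an isolated energy value $p^0 = E_{\mathbf p} =
\sqrt{|\mathbf p|^2 + m^2}$, and at this point Eq. (7.36) has a pole:

$$\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\{\phi(x)\phi(z_1)\cdots\}\,|\Omega\rangle \;\underset{p^0\to+E_{\mathbf p}}{\sim}\; \frac{i}{p^2-m^2+i\epsilon}\,\sqrt{Z}\,\langle\mathbf p|\,T\{\phi(z_1)\cdots\}\,|\Omega\rangle . \tag{7.37}$$

The factor $\sqrt{Z}$ is the same field strength renormalization factor as in Eq. (7.8),
since it replaces the same matrix element as in (7.7).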
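The statement that the one-sided denominator $2E_{\mathbf p}(p^0 - E_{\mathbf p})$ coincides with the covariant one, $p^2 - m^2$, near the pole follows from $p^2 - m^2 = (p^0 - E_{\mathbf p})(p^0 + E_{\mathbf p})$. The following sketch checks this numerically for real $p^0$ slightly off shell; the sample mass and momentum and the function name are assumptions for illustration, and the $i\epsilon$ terms are omitted.

```python
import math

def pole_ratio(p0, p_vec_sq, m):
    """Ratio of 1/(p^2 - m^2) to 1/(2 E_p (p^0 - E_p)) for real, off-shell p^0."""
    e_p = math.sqrt(p_vec_sq + m * m)
    covariant = 1.0 / (p0 * p0 - p_vec_sq - m * m)
    one_sided = 1.0 / (2.0 * e_p * (p0 - e_p))
    return covariant / one_sided

m, p_vec_sq = 1.0, 0.75          # assumed sample mass and |p|^2
e_p = math.sqrt(p_vec_sq + m * m)
for delta in (1e-1, 1e-3, 1e-5):
    # The ratio equals 2 E_p / (p^0 + E_p), which tends to 1 as delta -> 0.
    print(delta, pole_ratio(e_p + delta, p_vec_sq, m))
```

The two expressions differ by terms that stay finite at $p^0 = E_{\mathbf p}$, so they have the same pole and the same residue there.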

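To evaluate the contribution from region III, we would put $\phi(x)$ last in the
operator ordering, then insert a complete set of states between $T\{\phi(z_1)\cdots\}$
and $\phi(x)$. Repeating the above argument produces a pole as $p^0 \to -E_{\mathbf p}$:

$$\int d^4x\,e^{ip\cdot x}\,\langle\Omega|\,T\{\phi(x)\phi(z_1)\cdots\}\,|\Omega\rangle \;\underset{p^0\to-E_{\mathbf p}}{\sim}\; \langle\Omega|\,T\{\phi(z_1)\cdots\}\,|{-\mathbf p}\rangle\,\sqrt{Z}\,\frac{i}{p^2-m^2+i\epsilon} . \tag{7.38}$$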


Next we would like to Fourier-transform with respect to the remaining
field coordinates. To keep the various external particles from interfering, however,
we must isolate them from each other in space. Let us therefore repeat
the preceding calculation using a wavepacket rather than a simple Fourier
transform. In Eq. (7.35), replace

$$\int d^4x\,e^{ip^0x^0}e^{-i\mathbf p\cdot\mathbf x} \;\to\; \int\frac{d^3k}{(2\pi)^3}\int d^4x\,e^{ip^0x^0}e^{-i\mathbf k\cdot\mathbf x}\,\varphi(\mathbf k), \tag{7.39}$$

where $\varphi(\mathbf k)$ is a narrow distribution centered on $\mathbf k = \mathbf p$. This distribution
constrains $x$ to lie within a band, whose spatial extent is that of the wavepacket,
about the trajectory of a particle with momentum $\mathbf p$. With this modification,
the right-hand side of (7.36) has a more complicated singularity structure:

$$\begin{aligned}
&\sum_\lambda\int\frac{d^3k}{(2\pi)^3}\,\varphi(\mathbf k)\,\frac{1}{2E_{\mathbf k}(\lambda)}\,\frac{i}{p^0-E_{\mathbf k}(\lambda)+i\epsilon}\,\langle\Omega|\phi(0)|\lambda_0\rangle\,\langle\lambda_{\mathbf k}|\,T\{\phi(z_1)\cdots\}\,|\Omega\rangle \\
&\qquad \underset{p^0\to+E_{\mathbf p}}{\sim}\; \int\frac{d^3k}{(2\pi)^3}\,\varphi(\mathbf k)\,\frac{i}{\tilde p^2-m^2+i\epsilon}\,\sqrt{Z}\,\langle\mathbf k|\,T\{\phi(z_1)\cdots\}\,|\Omega\rangle ,
\end{aligned} \tag{7.40}$$

where, in the second line, $\tilde p = (p^0, \mathbf k)$. The one-particle singularity is now a
branch cut, whose length is the width in momentum space of the wavepacket
$\varphi(\mathbf k)$. However, if $\varphi(\mathbf k)$ defines the momentum narrowly, this branch cut
is very short, and (7.40) has a well-defined limit in which $\varphi(\mathbf k)$ tends to
$(2\pi)^3\delta^{(3)}(\mathbf k - \mathbf p)$ and the singularity of (7.40) sharpens up to the pole of
(7.36). The singularity due to single-particle states in the far past, Eq. (7.38),
is modified in the same way.

Now consider integrating each of the coordinates in the $(n + 2)$-point
correlation function against a wavepacket, to form*

$$\left(\prod_i\int\frac{d^3k_i}{(2\pi)^3}\int d^4x_i\;e^{i\tilde p_i\cdot x_i}\,\varphi_i(\mathbf k_i)\right)\langle\Omega|\,T\{\phi(x_1)\phi(x_2)\cdots\}\,|\Omega\rangle . \tag{7.41}$$

The wavepackets should be chosen to overlap in a region around $x = 0$ and to
separate in the far past and the far future. To analyze this integral, we choose
a large positive time $T_+$ such that all of the wavepackets are well separated
for $x_i^0 > T_+$, and we choose a large negative time $T_-$ such that all of the
wavepackets are well separated for $x_i^0 < T_-$. Then we can break up each
of the integrals over $x_i^0$ into three regions as in (7.34). The integral of any $x_i^0$
over the bounded region II leads to an expression analytic in the corresponding
energy $p_i^0$, so we can concentrate on the case in which all of the $x_i^0$ are placed
at large past or future times.

For definiteness, consider the contribution in which only two of the time
coordinates, $x_1^0$ and $x_2^0$, are in the future. In this case, the fields $\phi(x_1)$ and
$\phi(x_2)$ stand to the left of the other fields in time order. Inserting a complete
set of states $|\lambda_{\mathbf K}\rangle$, the integrations in (7.41) over the coordinates of these two
fields take the form

$$\sum_\lambda\int\frac{d^3K}{(2\pi)^3}\,\frac{1}{2E_{\mathbf K}}\left(\prod_{i=1,2}\int\frac{d^3k_i}{(2\pi)^3}\int d^4x_i\;e^{i\tilde p_i\cdot x_i}\,\varphi_i(\mathbf k_i)\right)\langle\Omega|\,T\{\phi(x_1)\phi(x_2)\}\,|\lambda_{\mathbf K}\rangle\,\langle\lambda_{\mathbf K}|\,T\{\phi(x_3)\cdots\}\,|\Omega\rangle .$$

The state $|\lambda_{\mathbf K}\rangle$ is annihilated by two field operators constrained to lie in
distant wavepackets. It must therefore consist of two distinct excitations of
the vacuum at two distinct locations. If these excitations are well separated,


*As in Section 4.5, the product symbol applies symbolically to the integrations as
well as to the other factors within the parentheses; the $x_i$ integrals apply to what lies
outside the parentheses as well.
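they should be independent of one another, so we can approximate

$$\sum_\lambda\int\frac{d^3K}{(2\pi)^3}\,\frac{1}{2E_{\mathbf K}}\,\langle\Omega|\,T\{\phi(x_1)\phi(x_2)\}\,|\lambda_{\mathbf K}\rangle\langle\lambda_{\mathbf K}| = \sum_{\lambda_1,\lambda_2}\int\frac{d^3q_1}{(2\pi)^3}\,\frac{1}{2E_{\mathbf q_1}}\int\frac{d^3q_2}{(2\pi)^3}\,\frac{1}{2E_{\mathbf q_2}}\,\langle\Omega|\phi(x_1)|\lambda_{\mathbf q_1}\rangle\,\langle\Omega|\phi(x_2)|\lambda_{\mathbf q_2}\rangle\,\langle\lambda_{\mathbf q_1}\lambda_{\mathbf q_2}| .$$

The sums over $\lambda_1$ and $\lambda_2$ in this equation run over all zero-momentum
states, but only single-particle states will contribute the poles we are looking
for. In this case, the integrals over $x_1^0$ and $\mathbf q_1$ produce a sharp singularity in
$p_1^0$ of the form of (7.40), and the integrals over $x_2^0$ and $\mathbf q_2$ produce the same
singular behavior in $p_2^0$. The term in (7.41) with both singularities is

$$\left(\prod_{i=1,2}\int\frac{d^3k_i}{(2\pi)^3}\,\varphi_i(\mathbf k_i)\,\frac{i}{\tilde p_i^2-m^2+i\epsilon}\cdot\sqrt{Z}\right)\langle\mathbf k_1\mathbf k_2|\,T\{\phi(x_3)\cdots\}\,|\Omega\rangle .$$

In the limit in which the wavepackets tend to delta functions concentrated at
definite momenta $\mathbf p_1$ and $\mathbf p_2$, this expression tends to

$$\left(\prod_{i=1,2}\frac{i}{p_i^2-m^2+i\epsilon}\cdot\sqrt{Z}\right){}_{\rm out}\langle\mathbf p_1\mathbf p_2|\,T\{\phi(x_3)\cdots\}\,|\Omega\rangle .$$

The state $\langle\mathbf p_1\mathbf p_2|$ is precisely an out state as defined in Section 4.5, since it
is the definite-momentum limit of a state of particles constrained to
well-separated wavepackets. Applying the same analysis to the times $x_i^0$ in the far
past gives the result that the coefficient of the maximally singular term in
the corresponding $p_i^0$ is a matrix element with an in state. This most singular
term in (7.41) thus has the form

$$\left(\prod_{i=1,2}\frac{i}{p_i^2-m^2+i\epsilon}\cdot\sqrt{Z}\right)\left(\prod_{i=3,\ldots}\frac{i}{p_i^2-m^2+i\epsilon}\cdot\sqrt{Z}\right){}_{\rm out}\langle\mathbf p_1\mathbf p_2|{-\mathbf p_3}\cdots\rangle_{\rm in} .$$

The last factor is just an S-matrix element.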

We have now shown that we can extract the value of an S-matrix element
by folding the corresponding vacuum expectation value of fields with
wavepackets, extracting the leading singularities in the energies $p_i^0$, and then
taking the limit as these wavepackets become delta functions of momenta.
However, the computation would be made much simpler if we could do these
operations in the reverse order: first letting the wavepackets become delta
functions, returning us to the simple Fourier transform, and then extracting
the singularities. In fact, the result for the leading singularity is not changed
by this switch of the order of operations. It is, however, rather subtle to argue
this point. Roughly, the explanation is the following: In the language of the
analysis just completed, new singularities might arise because, in the Fourier
transform, $x_1$ and $x_2$ can become close together in the far future. However,
in this region, the exponential factor is close to $\exp[i(p_1+p_2)\cdot x_1]$, and thus
the new singularities are single poles in the variable $(p_1^0 + p_2^0)$, rather than
being products of poles in the two separate energy variables. A more careful
argument (unfortunately, couched in a rather different language) can be
found in the original paper of Lehmann, Symanzik, and Zimmermann cited
at the beginning of this section.

Given the ability to make this reversal in the order of operations, we
obtain a precise relation between Fourier transforms of correlation functions
and S-matrix elements. This is the LSZ reduction formula:

$$\begin{aligned}
&\prod_{i=1}^{n}\int d^4x_i\,e^{ip_i\cdot x_i}\prod_{j=1}^{m}\int d^4y_j\,e^{-ik_j\cdot y_j}\;\langle\Omega|\,T\{\phi(x_1)\cdots\phi(x_n)\phi(y_1)\cdots\phi(y_m)\}\,|\Omega\rangle \\
&\qquad \underset{\substack{\text{each } p_i^0\to+E_{\mathbf p_i} \\ \text{each } k_j^0\to+E_{\mathbf k_j}}}{\sim}\; \left(\prod_{i=1}^{n}\frac{\sqrt{Z}\,i}{p_i^2-m^2+i\epsilon}\right)\left(\prod_{j=1}^{m}\frac{\sqrt{Z}\,i}{k_j^2-m^2+i\epsilon}\right)\langle\mathbf p_1\cdots\mathbf p_n|\,S\,|\mathbf k_1\cdots\mathbf k_m\rangle .
\end{aligned} \tag{7.42}$$

The quantity $Z$ that appears in this equation is exactly the field-strength
renormalization constant, defined in Section 7.1 as the residue of the
single-particle pole in the two-point function of fields. Each distinct particle will
have a distinct renormalization factor $Z$, obtained from its own two-point
function. For higher-spin fields, each factor of $\sqrt{Z}$ comes with a polarization
factor such as $u^s(p)$, as in Eq. (7.12). The polarization $s$ must be summed
over in the second line of (7.42).

In words, the LSZ formula says that an S-matrix element can be computed
as follows. Compute the appropriate Fourier-transformed correlation function,
look at the region of momentum space where the external particles are near
the mass shell, and identify the coefficient of the multiparticle pole. For fields
with spin, one must also multiply by a polarization spinor (like $u^s(p)$) or
vector (like $\epsilon^r(k)$) to project out the desired spin state.

Our next goal is to express this procedure in the language of Feynman
diagrams. Let us analyze the relation between the diagrammatic expansion of
the scalar field four-point function and the S-matrix element for 2-particle $\to$
2-particle scattering. We will consider explicitly the fully connected Feynman
diagrams contributing to the correlator. By a similar analysis, it is easy to
confirm that disconnected diagrams should be disregarded because they do
not have the singularity structure, with a product of four poles, indicated on
the right-hand side of (7.42).

The exact four-point function

$$\left(\prod_{i=1}^{2}\int d^4x_i\,e^{ip_i\cdot x_i}\right)\left(\prod_{i=1}^{2}\int d^4y_i\,e^{-ik_i\cdot y_i}\right)\langle\Omega|\,T\{\phi(x_1)\phi(x_2)\phi(y_1)\phi(y_2)\}\,|\Omega\rangle$$

has the general form shown in Fig. 7.4. In this figure, we have indicated 
explicitly the diagrammatic corrections on each leg; the shaded circle in the 
center represents the sum of all amputated four-point diagrams. 

Figure 7.4. Structure of the exact four-point function in scalar field theory.

We can sum up the corrections to each external leg just as we did for the
electron propagator in the previous section. Let $-iM^2(p^2)$ denote the sum of
all one-particle-irreducible (1PI) insertions into the scalar propagator:

[Diagram: a 1PI blob on a scalar line, equal to $-iM^2(p^2)$.]

Then the exact propagator can be written as a geometric series and summed
as in Eq. (7.23):

$$
\frac{i}{p^2 - m_0^2} + \frac{i}{p^2 - m_0^2}\,(-iM^2)\,\frac{i}{p^2 - m_0^2} + \cdots
= \frac{i}{p^2 - m_0^2 - M^2(p^2)}.
\tag{7.43}
$$
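The resummation in (7.43) can be checked numerically. The following is a minimal sketch with arbitrary illustrative numbers (the values of `p2`, `m0_sq`, and `M_sq` are assumptions chosen only so that the series converges); it compares a long partial sum of the series with the closed form.

```python
# Toy numerical check of the geometric series behind Eq. (7.43).
# All numbers are arbitrary illustrative values, not taken from the text.
p2 = 2.0 + 0.0j      # p^2
m0_sq = 1.0          # bare mass squared m_0^2
M_sq = 0.3           # assumed value of the self-energy M^2(p^2) at this p^2

bare = 1j / (p2 - m0_sq)      # free propagator i/(p^2 - m_0^2)
insertion = -1j * M_sq        # one 1PI insertion, -i M^2

# Each extra term multiplies the previous one by (insertion * bare).
ratio = insertion * bare      # |ratio| < 1 is needed for the series to converge
partial_sum = sum(bare * ratio**n for n in range(200))

resummed = 1j / (p2 - m0_sq - M_sq)   # closed form on the right of (7.43)

print(abs(partial_sum - resummed))
```

With these numbers the common ratio is 0.3, so the partial sum converges quickly. For values where the ratio exceeds one in magnitude the series diverges, and the closed form should then be read as its analytic continuation.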


Notice that, as in the case of the electron propagator, our sign convention
for the 1PI self-energy $M^2(p^2)$ implies that a positive contribution to $M^2(p^2)$
corresponds to a positive shift of the scalar particle mass. If we expand each
resummed propagator about the physical particle pole, we see that each
external leg of the four-point amplitude contributes

$$
\frac{i}{p^2 - m_0^2 - M^2} \;\underset{p^0 \to E_{\mathbf{p}}}{\sim}\; \frac{iZ}{p^2 - m^2} + (\text{regular}).
\tag{7.44}
$$


Thus, the sum of diagrams contains a product of four poles:

$$
\frac{iZ}{p_1^2 - m^2}\,\frac{iZ}{p_2^2 - m^2}\,\frac{iZ}{k_1^2 - m^2}\,\frac{iZ}{k_2^2 - m^2}.
$$

This is exactly the singularity on the second line of (7.42). Comparing the
coefficients of this product of poles, we find the relation

$$
\langle \mathbf{p}_1\mathbf{p}_2|\, S\, |\mathbf{k}_1\mathbf{k}_2\rangle = \left(\sqrt{Z}\right)^4 \times [\text{shaded circle}],
$$

where the shaded circle is the sum of amputated four-point diagrams and $Z$
is the field-strength renormalization factor.

An identical analysis can be applied to the Fourier transform of the
$(n+2)$-point correlator in a general field theory. The relation between $S$-matrix
elements and Feynman diagrams then takes the form

$$
\langle \mathbf{p}_1\cdots\mathbf{p}_n|\, S\, |\mathbf{k}_1\mathbf{k}_2\rangle = \left(\sqrt{Z}\right)^{n+2} \times [\text{sum of amputated diagrams}].
\tag{7.45}
$$

(If the external particles are of different species, each has its own
renormalization factor $\sqrt{Z}$; if the particles have nonzero spin, there will be
additional polarization factors such as $u^s(k)$ on the right-hand side.) This is
almost precisely the diagrammatic formula for the $S$-matrix that we wrote down in
Section 4.6. The only new feature is the appearance of the renormalization
factors $\sqrt{Z}$. The $Z$ factors are irrelevant for calculations at the leading order
of perturbation theory, but are important in the calculation of higher-order
corrections.

Up to this point, we have performed only one full calculation of a higher-order
correction, the computation of the order-$\alpha$ corrections to the electron
form factors. We did not take into account the effects of the electron
field-strength renormalization. Let us now add in this factor and examine its effects.

Since the expressions (6.28) and (6.30) for electron scattering from a heavy
target were derived using our previous, incorrect formula for $S$-matrix
elements, we should correct these formulae by inserting factors of $\sqrt{Z_2}$ for the
initial and final electrons. Equation (6.33) for the structure of the exact vertex
should then read

$$
Z_2\,\Gamma^\mu(p',p) = \gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\,F_2(q^2),
\tag{7.46}
$$

with $\Gamma^\mu(p',p)$ the sum of amputated electron-photon vertex diagrams.

We can use this equation to reevaluate the form factors to order $\alpha$. Since
$Z_2 = 1 + O(\alpha)$ and $F_2$ begins in order $\alpha$, our previous computation of $F_2$ is
unaffected. To compute $F_1$, write the left-hand side of (7.46) as

$$
Z_2\,\Gamma^\mu = (1 + \delta Z_2)(\gamma^\mu + \delta\Gamma^\mu) = \gamma^\mu + \delta\Gamma^\mu + \gamma^\mu\cdot\delta Z_2,
$$

where $\delta Z_2$ and $\delta\Gamma^\mu$ denote the order-$\alpha$ corrections to these quantities.
Comparing to the right-hand side of (7.46), we see that $F_1(q^2)$ receives a new
contribution equal to $\delta Z_2$. Now let $\delta F_1(q^2)$ denote the (unsubtracted)
correction to the form factor that we computed in Section 6.3, and recall from the
end of Section 7.1 that $\delta Z_2 = -\delta F_1(0)$. Then

$$
F_1(q^2) = 1 + \delta F_1(q^2) + \delta Z_2 = 1 + \left[\delta F_1(q^2) - \delta F_1(0)\right].
$$

This is exactly the result we claimed, but did not prove, in Section 6.3. The 
inclusion of field-strength renormalization justifies the subtraction procedure 
that we applied on an ad hoc basis there. 

At this level of analysis, it is difficult to see how the cancellation of
divergences in $F_1$ will persist to higher orders. Worse, though we argued in
Section 6.3 for the general result $F_1(0) = 1$, our verification of this result in
order $\alpha$ seems to depend on a numerical coincidence.

We can state this problem carefully as follows: Define a second rescaling
factor $Z_1$ by the relation

$$
\Gamma^\mu(q = 0) = Z_1^{-1}\gamma^\mu,
\tag{7.47}
$$

where $\Gamma^\mu$ is the complete amputated vertex function. To find $F_1(0) = 1$,
we must prove the identity $Z_1 = Z_2$, so that the vertex rescaling exactly
compensates the electron field-strength renormalization. We will prove this
identity to all orders in perturbation theory at the end of Section 7.4.

We conclude our discussion of the LSZ reduction formula with one further
formal observation. The LSZ formula distinguishes in and out particles
only by the sign of the Fourier transform momentum $p_i^0$ or $k_i^0$. This means
that, by analytically continuing the residue of the pole in $p^2$ from positive
to negative $p^0$, we can convert the $S$-matrix element with $\phi(p)$ in the final
state into the $S$-matrix element with the antiparticle $\phi^*(-p)$ in the initial
state. This is exactly the statement of crossing symmetry, which we derived
diagrammatically in Section 5.4:

$$
\langle \cdots \phi(p)|\, S\, |\cdots\rangle\Big|_{p=-k} = \langle \cdots|\, S\, |\phi^*(k)\cdots\rangle.
$$

Since the proof of the LSZ formula does not depend on perturbation theory, we
see that the crossing symmetry of the $S$-matrix is a general result of quantum
theory, not merely a property of Feynman diagrams.


7.3 The Optical Theorem 

In Section 7.1 we saw that the two-point correlation function of quantum
fields, viewed as an analytic function of the momentum $p^2$, has branch cut
singularities associated with multiparticle intermediate states. This conclusion
should not be surprising to those familiar with nonrelativistic scattering
theory, since it is already true there that the scattering amplitude, as a function
of energy, has a branch cut on the positive real axis. The imaginary part of
the scattering amplitude appears as a discontinuity across this branch cut. By
the optical theorem, the imaginary part of the forward scattering amplitude is
proportional to the total cross section. We will now prove the field-theoretic
version of the optical theorem and illustrate how it arises in Feynman diagram
calculations.

Figure 7.5. The optical theorem: The imaginary part of a forward scattering
amplitude arises from a sum of contributions from all possible
intermediate-state particles.

The optical theorem is a straightforward consequence of the unitarity of
the $S$-matrix: $S^\dagger S = 1$. Inserting $S = 1 + iT$ as in (4.72), we have

$$
-i(T - T^\dagger) = T^\dagger T.
\tag{7.48}
$$
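As a finite-dimensional illustration of (7.48), the sketch below builds a random unitary matrix to play the role of $S$ and checks the identity numerically. The matrix size, the random seed, and the variable names are assumptions made for this example. A real $S$-matrix acts on an infinite-dimensional space, so this only illustrates the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# A random unitary matrix (Q factor of a complex Gaussian matrix) stands in for S.
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
S, _ = np.linalg.qr(A)

T = (S - np.eye(n)) / 1j          # defined through S = 1 + iT
lhs = -1j * (T - T.conj().T)      # -i (T - T^dagger)
rhs = T.conj().T @ T              # T^dagger T
unitarity_residual = np.max(np.abs(lhs - rhs))

# Diagonal element a: 2 Im T_aa = sum_f |T_fa|^2,
# a matrix analogue of "imaginary part of forward amplitude = total rate".
a = 0
forward = 2 * T[a, a].imag
total = np.sum(np.abs(T[:, a]) ** 2)

print(unitarity_residual, forward, total)
```

The diagonal element of the identity is the matrix analogue of the optical theorem: twice the imaginary part of the "forward" element equals the summed squared magnitude of all transitions out of that state.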

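Let us take the matrix element of this equation between two-particle states
$|\mathbf{p}_1\mathbf{p}_2\rangle$ and $|\mathbf{k}_1\mathbf{k}_2\rangle$. To evaluate the right-hand side, insert a complete set of
intermediate states:

$$
\langle \mathbf{p}_1\mathbf{p}_2|\, T^\dagger T\, |\mathbf{k}_1\mathbf{k}_2\rangle
= \sum_n \left(\prod_{i=1}^{n} \int \frac{d^3q_i}{(2\pi)^3}\frac{1}{2E_i}\right)
\langle \mathbf{p}_1\mathbf{p}_2|\, T^\dagger\, |\{\mathbf{q}_i\}\rangle \langle \{\mathbf{q}_i\}|\, T\, |\mathbf{k}_1\mathbf{k}_2\rangle.
$$

Now express the $T$-matrix elements as invariant matrix elements $\mathcal{M}$ times
4-momentum-conserving delta functions. Identity (7.48) then becomes

$$
\begin{aligned}
&-i\left[\mathcal{M}(k_1k_2 \to p_1p_2) - \mathcal{M}^*(p_1p_2 \to k_1k_2)\right] \\
&\quad = \sum_n \left(\prod_{i=1}^{n} \int \frac{d^3q_i}{(2\pi)^3}\frac{1}{2E_i}\right)
\mathcal{M}^*(p_1p_2 \to \{q_i\})\, \mathcal{M}(k_1k_2 \to \{q_i\})
\times (2\pi)^4\delta^{(4)}\Big(k_1 + k_2 - \sum_i q_i\Big),
\end{aligned}
$$

times an overall delta function $(2\pi)^4\delta^{(4)}(k_1 + k_2 - p_1 - p_2)$. Let us abbreviate
this identity as

$$
-i\left[\mathcal{M}(a \to b) - \mathcal{M}^*(b \to a)\right] = \sum_f \int d\Pi_f\, \mathcal{M}^*(b \to f)\, \mathcal{M}(a \to f),
\tag{7.49}
$$

where the sum runs over all possible sets $f$ of final-state particles. Although
we have so far assumed that $a$ and $b$ are two-particle states, they could equally
well be one-particle or multiparticle asymptotic states.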

For the important special case of forward scattering, we can set $p_i = k_i$
to obtain a simpler identity, shown pictorially in Fig. 7.5. Supplying the
kinematic factors required by (4.79) to build a cross section, we obtain the
standard form of the optical theorem,

$$
\operatorname{Im} \mathcal{M}(k_1, k_2 \to k_1, k_2) = 2E_{\rm cm}\, p_{\rm cm}\, \sigma_{\rm tot}(k_1, k_2 \to \text{anything}),
\tag{7.50}
$$

where $E_{\rm cm}$ is the total center-of-mass energy and $p_{\rm cm}$ is the momentum of
either particle in the center-of-mass frame. This equation relates the forward
scattering amplitude to the total cross section for production of all final states.
Since the imaginary part of the forward scattering amplitude gives the
attenuation of the forward-going wave as the beam passes through the target, it is
natural that this quantity should be proportional to the probability of
scattering. Identity (7.50) gives the precise connection.

The Optical Theorem for Feynman Diagrams 

Let us now investigate how this identity for the imaginary part of an
$S$-matrix element arises in the Feynman diagram expansion. It is easily checked
(in QED, for example) that each diagram contributing to an $S$-matrix element
$\mathcal{M}$ is purely real unless some denominators vanish, so that the $i\epsilon$ prescription
for treating the poles becomes relevant. A Feynman diagram thus yields an
imaginary part for $\mathcal{M}$ only when the virtual particles in the diagram go
on-shell. We will now show how to isolate and compute this imaginary part.

For our present purposes, let us define $\mathcal{M}$ by the Feynman rules for
perturbation theory. This allows us to consider $\mathcal{M}(s)$ as an analytic function of
the complex variable $s = E_{\rm cm}^2$, even though $S$-matrix elements are defined
only for external particles with real momenta.

We first demonstrate that the appearance of an imaginary part of $\mathcal{M}(s)$
always requires a branch cut singularity. Let $s_0$ be the threshold energy for
production of the lightest multiparticle state. For real $s$ below $s_0$ the
intermediate state cannot go on-shell, so $\mathcal{M}(s)$ is real. Thus, for real $s < s_0$, we have
the identity

$$
\mathcal{M}(s) = \left[\mathcal{M}(s^*)\right]^*.
\tag{7.51}
$$

Each side of this equation is an analytic function of $s$, so it can be analytically
continued to the entire complex $s$ plane. In particular, near the real axis for
$s > s_0$, Eq. (7.51) implies

$$
\begin{aligned}
\operatorname{Re} \mathcal{M}(s + i\epsilon) &= \operatorname{Re} \mathcal{M}(s - i\epsilon); \\
\operatorname{Im} \mathcal{M}(s + i\epsilon) &= -\operatorname{Im} \mathcal{M}(s - i\epsilon).
\end{aligned}
$$

There is a branch cut across the real axis, starting at the threshold energy $s_0$;
the discontinuity across the cut is

$$
\operatorname{Disc} \mathcal{M}(s) = 2i \operatorname{Im} \mathcal{M}(s + i\epsilon).
$$

Usually it is easier to compute the discontinuity of a diagram than to compute
the imaginary part directly. The $i\epsilon$ prescription in the Feynman propagator
indicates that physical scattering amplitudes should be evaluated above the
cut, at $s + i\epsilon$.
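The reflection property (7.51) and the resulting discontinuity can be illustrated with a toy function. The sketch below uses $-\log(s_0 - s)$, which is real for real $s < s_0$ and has a branch cut for real $s > s_0$. This function, the threshold value, and the sample points are assumptions chosen for illustration; it is not an actual scattering amplitude.

```python
import numpy as np

s0 = 4.0      # toy threshold (arbitrary)
eps = 1e-9    # small imaginary offset standing in for i*epsilon

def toy_amplitude(s):
    """Toy function: real for real s < s0, branch cut along real s > s0."""
    return -np.log(complex(s0 - s))

below = toy_amplitude(1.0)                      # real s below threshold

# Reflection identity M(s) = [M(s*)]* at a generic complex point off the cut.
s_cplx = 2.0 + 3.0j
reflection_gap = abs(toy_amplitude(s_cplx)
                     - np.conj(toy_amplitude(np.conj(s_cplx))))

upper = toy_amplitude(6.0 + 1j * eps)           # just above the cut
lower = toy_amplitude(6.0 - 1j * eps)           # just below the cut
disc = upper - lower                            # discontinuity across the cut

print(below, reflection_gap, upper, lower, disc)
```

The real parts on the two sides of the cut agree, the imaginary parts are equal and opposite, and the difference equals $2i$ times the imaginary part evaluated above the cut, as in the expression for $\operatorname{Disc}\mathcal{M}(s)$.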

We already saw in Section 7.1 that the electron self-energy diagram has
a branch cut beginning at the physical electron-photon threshold. Let us now
study more general one-loop diagrams, and show that their discontinuities
give precisely the imaginary parts required by (7.49). The generalization of
this result to multiloop diagrams has been proven by Cutkosky,† who showed
in the process that the discontinuity of a Feynman diagram across its branch
cut is always given by a simple set of cutting rules.‡

We begin by checking (7.49) in $\phi^4$ theory. Since the right-hand side of
(7.49) begins in order $\lambda^2$, we expect that $\operatorname{Im}\mathcal{M}$ should also receive its first
contribution from higher-order diagrams. Consider, then, the order-$\lambda^2$
diagram

[Diagram: one-loop $s$-channel diagram with incoming momenta $k_1$, $k_2$ and loop momenta $k/2 + q$, $k/2 - q$.]

with a loop in the $s$-channel. (It is easy to check that the corresponding $t$- and
$u$-channel diagrams have no branch cut singularities for $s$ above threshold.)
The total momentum is $k = k_1 + k_2$, and for simplicity we have chosen the
symmetrical routing of momenta shown above. The value of this Feynman
diagram is

$$
i\delta\mathcal{M} = \frac{\lambda^2}{2} \int \frac{d^4q}{(2\pi)^4}\,
\frac{1}{(k/2 - q)^2 - m^2 + i\epsilon}\, \frac{1}{(k/2 + q)^2 - m^2 + i\epsilon}.
\tag{7.52}
$$

When this integral is evaluated using the methods of Section 6.3, the Wick
rotation produces an extra factor of $i$, so that, below threshold, $\delta\mathcal{M}$ is purely
real.

We would like to verify that the integral (7.52) has a discontinuity across
the real axis in the physical region $k^0 > 2m$. It is easiest to identify this
discontinuity by computing the integral for $k^0 < 2m$, then increasing $k^0$ by
analytic continuation. It is not difficult to compute the integral directly using
Feynman parameters (see Problem 7.1). However, it is illuminating to use a
less direct approach, as follows.

Let us work in the center-of-mass frame, where $k = (k^0, \mathbf{0})$. Then the
integrand of (7.52) has four poles in the integration variable $q^0$, at the locations

$$
q^0 = \tfrac{1}{2}k^0 \pm (E_{\mathbf{q}} - i\epsilon), \qquad q^0 = -\tfrac{1}{2}k^0 \pm (E_{\mathbf{q}} - i\epsilon).
$$

†R. E. Cutkosky, J. Math. Phys. 1, 429 (1960).

‡These rules are simple only for singularities in the physical region. Away from
the physical region, the singularities of three- and higher-point amplitudes can become
quite intricate. This subject is reviewed in R. J. Eden, P. V. Landshoff, D. I. Olive,
and J. C. Polkinghorne, The Analytic S-Matrix (Cambridge University Press, 1966).

Two of these poles lie above the real $q^0$ axis and two lie below, as shown:

[Diagram: positions of the four poles in the complex $q^0$ plane.]

We will close the integration contour downward and pick up the residues of the
poles in the lower half-plane. Of these, only the pole at $q^0 = -(1/2)k^0 + E_{\mathbf{q}}$
will contribute to the discontinuity. Note that picking up the residue of this
pole is equivalent to replacing

$$
\frac{1}{(k/2 + q)^2 - m^2 + i\epsilon} \;\to\; -2\pi i\, \delta\big((k/2 + q)^2 - m^2\big)
\tag{7.53}
$$

under the $dq^0$ integral.

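The contribution of this pole yields the integral

$$
\begin{aligned}
i\delta\mathcal{M} &= -2\pi i\, \frac{\lambda^2}{2} \int \frac{d^3q}{(2\pi)^4}\, \frac{1}{2E_{\mathbf{q}}}\, \frac{1}{(k^0 - E_{\mathbf{q}})^2 - E_{\mathbf{q}}^2} \\
&= -2\pi i\, \frac{\lambda^2}{2}\, \frac{4\pi}{(2\pi)^4} \int_m^\infty dE_{\mathbf{q}}\, E_{\mathbf{q}}\, |\mathbf{q}|\, \frac{1}{2E_{\mathbf{q}}}\, \frac{1}{k^0(k^0 - 2E_{\mathbf{q}})}.
\end{aligned}
\tag{7.54}
$$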


The integrand in the second line has a pole at $E_{\mathbf{q}} = k^0/2$. When $k^0 < 2m$,
this pole does not lie on the integration contour, so $\delta\mathcal{M}$ is manifestly real.
When $k^0 > 2m$, however, the pole lies just above or below the contour of
integration, depending upon whether $k^0$ is given a small positive or negative
imaginary part:

[Diagram: the pole at $E_{\mathbf{q}} = k^0/2$ lying just above or below the $E_{\mathbf{q}}$ integration contour.]

Thus the integral acquires a discontinuity between $k^2 + i\epsilon$ and $k^2 - i\epsilon$. To
compute this discontinuity, apply

$$
\frac{1}{k^0 - 2E_{\mathbf{q}} \pm i\epsilon} = P\,\frac{1}{k^0 - 2E_{\mathbf{q}}} \mp i\pi\,\delta(k^0 - 2E_{\mathbf{q}})
$$

(where $P$ denotes the principal value). The discontinuity is given by replacing
the pole with a delta function. This in turn is equivalent to replacing the
original propagator by a delta function:

$$
\frac{1}{(k/2 - q)^2 - m^2 + i\epsilon} \;\to\; -2\pi i\, \delta\big((k/2 - q)^2 - m^2\big).
\tag{7.55}
$$
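The principal-value decomposition used above can be checked numerically by smearing $1/(x - x_0 + i\epsilon)$ against a smooth test function. This is a minimal sketch: the Gaussian test function, the value of `eps`, and the grid are assumptions made for the example, and the agreement holds only up to corrections of order `eps`.

```python
import numpy as np

eps = 1e-3                 # finite stand-in for the infinitesimal epsilon
x0 = 0.5                   # location of the pole
x = np.linspace(-10.0, 10.0, 2_000_001)
dx = x[1] - x[0]           # grid spacing, much smaller than eps
f = np.exp(-x**2)          # smooth test function (arbitrary choice)

# Integrate f(x) / (x - x0 + i eps) with a simple Riemann sum.
integral = np.sum(f / (x - x0 + 1j * eps)) * dx

# The delta-function term predicts Im(integral) -> -pi * f(x0) as eps -> 0.
imag_expected = -np.pi * np.exp(-x0**2)

print(integral.imag, imag_expected)
```

The imaginary part of the smeared integral approaches $-\pi f(x_0)$, which is the content of the $\mp i\pi\,\delta$ term for the upper sign. The real part approaches the principal-value integral, which this sketch does not test separately.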
Figure 7.6. Two contributions to the optical theorem for Bhabha scattering.

Let us now retrace our steps and see what we have proved. Go back to
the original integral (7.52), relabel the momenta on the two propagators as
$p_1$, $p_2$ and substitute

$$
\int \frac{d^4q}{(2\pi)^4} = \int \frac{d^4p_1}{(2\pi)^4} \int \frac{d^4p_2}{(2\pi)^4}\, (2\pi)^4 \delta^{(4)}(p_1 + p_2 - k).
$$

We have shown that the discontinuity of the integral is computed by replacing
each of the two propagators by a delta function:

$$
\frac{1}{p_i^2 - m^2 + i\epsilon} \;\to\; -2\pi i\, \delta(p_i^2 - m^2).
\tag{7.56}
$$

The discontinuity of $\mathcal{M}$ comes only from the region of the $d^4q$ integral in which
the two delta functions are simultaneously satisfied. By integrating over the
delta functions, we put the momenta $p_i$ on shell and convert the integrals
$\int d^4p_i$ into integrals over relativistic phase space. What is left over in
expression (7.52) is just the factor $\lambda^2$, the square of the leading-order scattering
amplitude, and the symmetry factor (1/2), which can be reinterpreted as the
symmetry factor for identical bosons in the final state. Thus we have shown
that, to order $\lambda^2$ on each side of the equation,

$$
\operatorname{Disc} \mathcal{M}(k) = 2i \operatorname{Im} \mathcal{M}(k)
= \frac{i}{2} \int \frac{d^3p_1}{(2\pi)^3}\frac{1}{2E_1}\, \frac{d^3p_2}{(2\pi)^3}\frac{1}{2E_2}\,
\left|\mathcal{M}(k)\right|^2 (2\pi)^4 \delta^{(4)}(p_1 + p_2 - k).
$$

This explicitly verifies (7.49) to order $\lambda^2$ in $\phi^4$ theory.
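As a worked example, the phase-space integral in this result can be evaluated in closed form. The sketch below assumes the leading-order amplitude $\mathcal{M} = -\lambda$ of $\phi^4$ theory and the standard integrated two-body phase space for equal masses, $\int d\Pi_2 = \frac{1}{8\pi}\,\frac{2|\mathbf{p}|}{E_{\rm cm}}$. The symbol $s = k^2 = E_{\rm cm}^2$ is introduced here for convenience.

```latex
% Two-body phase space for equal masses m, with |p| = sqrt(s/4 - m^2):
\int d\Pi_2 = \frac{1}{8\pi}\,\frac{2|\mathbf{p}|}{E_{\rm cm}}
            = \frac{1}{8\pi}\sqrt{1 - \frac{4m^2}{s}} .

% Insert |M|^2 = lambda^2 and the identical-particle factor 1/2:
2\,\operatorname{Im}\mathcal{M}(k)
  = \frac{1}{2}\,\lambda^2 \int d\Pi_2
  \quad\Longrightarrow\quad
\operatorname{Im}\mathcal{M}(k)
  = \frac{\lambda^2}{32\pi}\sqrt{1 - \frac{4m^2}{s}}\;\theta(s - 4m^2).

% Cross-check against the optical theorem (7.50), using
% sigma_tot = lambda^2 / (32 pi s) at this order:
2E_{\rm cm}\,p_{\rm cm}\,\sigma_{\rm tot}
  = 2\sqrt{s}\cdot\frac{\sqrt{s}}{2}\sqrt{1 - \frac{4m^2}{s}}\cdot\frac{\lambda^2}{32\pi s}
  = \frac{\lambda^2}{32\pi}\sqrt{1 - \frac{4m^2}{s}} .
```

The imaginary part vanishes below the threshold $s = 4m^2$ and turns on with a square-root behavior above it, as expected for a two-particle branch cut.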

The preceding argument made no essential use of the fact that the two
propagators in the diagram had equal masses, or of the fact that these
propagators connected to a simple point vertex. Indeed, the analysis can be applied
to an arbitrary one-loop diagram. Whenever, in the region of momentum
integration of the diagram, two propagators can simultaneously go on-shell, we
can follow the argument above to compute a nonzero discontinuity of $\mathcal{M}$.
The value of this discontinuity is given by making the substitution (7.56) for
each of the two propagators. For example, in the order-$\alpha^2$ Bhabha scattering
diagrams shown in Fig. 7.6, we can compute the imaginary parts by cutting
through the diagrams as shown and putting the cut propagators on shell using
(7.56). The poles of the additional propagators in the diagrams do not
contribute to the discontinuities. By integrating over the delta functions as in the
previous paragraph, we derive the indicated relations between the imaginary
parts of these diagrams and contributions to the total cross section.

Cutkosky proved that this method of computing discontinuities is
completely general. The physical discontinuity of any Feynman diagram is given
by the following algorithm:

1. Cut through the diagram in all possible ways such that the cut
propagators can simultaneously be put on shell.

2. For each cut, replace $1/(p^2 - m^2 + i\epsilon) \to -2\pi i\,\delta(p^2 - m^2)$ in each cut
propagator, then perform the loop integrals.

3. Sum the contributions of all possible cuts.

Using these cutting rules, it is possible to prove the optical theorem (7.49) to
all orders in perturbation theory.

Unstable Particles 

The cutting rules imply that the generalized optical theorem (7.49) is true
not only for $S$-matrix elements, but for any amplitudes $\mathcal{M}$ that we can define
in terms of Feynman diagrams. This fact is extremely useful for dealing with
unstable particles, which never appear in asymptotic states.

Recall from Eq. (7.43) that the exact two-point function for a scalar
particle has the form

$$
\frac{i}{p^2 - m_0^2 - M^2(p^2)}.
$$

We defined the quantity $-iM^2(p^2)$ as the sum of all 1PI insertions into the
boson propagator, but we can equally well think of it as the sum of all
amputated diagrams for 1-particle $\to$ 1-particle “scattering”. The LSZ formula
then implies

$$
\mathcal{M}(p \to p) = -Z M^2(p^2).
\tag{7.57}
$$

We can use this relation and the generalized optical theorem (7.49) to discuss
the imaginary part of $M^2(p^2)$.

First consider the familiar case where the scalar boson is stable. In this
case, there is no possible final state that can contribute to the right-hand side
of (7.49). Thus $M^2(p^2)$ is real. The position of the pole in the propagator is
determined by the equation $m^2 - m_0^2 - M^2(m^2) = 0$, which has a real-valued
solution $m$. The pole therefore lies on the real $p^2$ axis, below the multiparticle
branch cut.

Often, however, a particle can decay into two or more lighter particles.
In this case $M^2(p^2)$ will acquire an imaginary part, so we must modify our
definitions slightly. Let us define the particle’s mass $m$ by the condition

$$
m^2 - m_0^2 - \operatorname{Re} M^2(m^2) = 0.
\tag{7.58}
$$

Then the pole in the propagator is displaced from the real axis:

$$
\frac{iZ}{p^2 - m^2 - iZ \operatorname{Im} M^2(p^2)}.
$$

If this propagator appears in the $s$ channel of a Feynman diagram, the cross
section one computes, in the vicinity of the pole, will have the form

$$
\sigma \propto \left|\frac{1}{s - m^2 - iZ \operatorname{Im} M^2(s)}\right|^2.
\tag{7.59}
$$

This expression closely resembles the relativistic Breit-Wigner formula (4.64)
for the cross section in the region of a resonance:

$$
\sigma \propto \left|\frac{1}{p^2 - m^2 + im\Gamma}\right|^2.
\tag{7.60}
$$

The mass $m$ defined by (7.58) is the position of the resonance. If $\operatorname{Im} M^2(m^2)$ is
small, so that the resonance in (7.59) is narrow, we can approximate $\operatorname{Im} M^2(s)$
as $\operatorname{Im} M^2(m^2)$ over the width of the resonance; then (7.59) will have precisely
the Breit-Wigner form. In this case, we can identify

$$
\Gamma = -\frac{Z}{m} \operatorname{Im} M^2(m^2).
\tag{7.61}
$$

If the resonance is broad, it will show deviations from the Breit-Wigner shape, 
generally becoming narrower on the leading edge and broader on the trailing 
edge. 
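The identification of $\Gamma$ with the width of the peak can be made concrete with a short numerical sketch of the Breit-Wigner shape (7.60). The values of `m` and `Gamma` below are arbitrary illustrative choices with $\Gamma \ll m$, and the overall normalization is ignored.

```python
import numpy as np

m, Gamma = 10.0, 0.1      # arbitrary illustrative values with Gamma << m
s = np.linspace((m - 1.0) ** 2, (m + 1.0) ** 2, 400_001)

# Breit-Wigner shape |1/(s - m^2 + i m Gamma)|^2, up to normalization.
shape = 1.0 / np.abs(s - m**2 + 1j * m * Gamma) ** 2

s_peak = s[np.argmax(shape)]                     # position of the maximum
above_half = s[shape >= 0.5 * shape.max()]       # region above half maximum
fwhm_in_s = above_half[-1] - above_half[0]       # width in s, about 2 m Gamma
fwhm_in_E = np.sqrt(above_half[-1]) - np.sqrt(above_half[0])  # width in E = sqrt(s)

print(s_peak, fwhm_in_s, fwhm_in_E)
```

The peak sits at $s = m^2$, the full width at half maximum in $s$ is $2m\Gamma$, and in the energy $\sqrt{s}$ it is approximately $\Gamma$ for a narrow resonance. This is why $\Gamma$ is read as the width of the state.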

To compute $\operatorname{Im} M^2$, and hence $\Gamma$, we could use the definition of $M^2$ as the
sum of 1PI insertions into the propagator. The imaginary parts of the relevant
loop diagrams give the decay rate. But the optical theorem (7.49), generalized
to Feynman diagrams by the Cutkosky rules, simplifies this procedure. If we
take (7.57) as the definition of the matrix element $\mathcal{M}(p \to p)$, and similarly
define the decay matrix elements $\mathcal{M}(p \to f)$ through their Feynman diagram
expansions, then (7.49) implies

$$
Z \operatorname{Im} M^2(p^2) = -\operatorname{Im} \mathcal{M}(p \to p) = -\frac{1}{2} \sum_f \int d\Pi_f\, \left|\mathcal{M}(p \to f)\right|^2,
\tag{7.62}
$$

where the sum runs over all possible final states $f$. The decay rate is therefore

$$
\Gamma = \frac{1}{2m} \sum_f \int d\Pi_f\, \left|\mathcal{M}(p \to f)\right|^2,
\tag{7.63}
$$

as quoted in Eq. (4.86).

We stress once again that our derivation of this equation applies only
to the case of a long-lived unstable particle, so that $\Gamma \ll m$. For a broad
resonance, the full energy dependence of $M^2(p^2)$ must be taken into account.


7.4 The Ward-Takahashi Identity

Of the loose ends listed at the beginning of this chapter, only one remains, the
proof of the Ward identity. Recall from Section 5.5 that this identity states
the following: If $\mathcal{M}(k) = \epsilon_\mu(k)\mathcal{M}^\mu(k)$ is the amplitude for some QED process
involving an external photon with momentum $k$, then this amplitude vanishes
if we replace $\epsilon_\mu$ with $k_\mu$:

$$
k_\mu \mathcal{M}^\mu(k) = 0.
\tag{7.64}
$$

To prove this assertion, it is useful to prove a more general identity for QED
correlation functions, called the Ward-Takahashi identity. To discuss this more
general case we will let $\mathcal{M}$ denote a Fourier-transformed correlation function,
in which the external momenta are not necessarily on-shell. The right-hand
side of (7.64) will contain nonzero terms in this case; but when we apply the
LSZ formula to extract an $S$-matrix element, those terms will not contribute.

We will prove the Ward-Takahashi identity order by order in $\alpha$, by looking
directly at the Feynman diagrams that contribute to $\mathcal{M}(k)$. The identity is
generally not true for individual Feynman diagrams; we must sum over the
diagrams for $\mathcal{M}(k)$ at any given order.

Consider a typical diagram for a typical amplitude $\mathcal{M}(k)$:

[Diagram: a typical QED diagram with an external photon of momentum $k$.]

If we remove the photon $\gamma(k)$, we obtain a simpler diagram which is part
of a simpler amplitude $\mathcal{M}_0$. If we reinsert the photon somewhere else inside
the simpler diagram, we again obtain a contribution to $\mathcal{M}(k)$. The crucial
observation is that by summing over all the diagrams that contribute to $\mathcal{M}_0$,
then summing over all possible points of insertion in each of these diagrams,
we obtain $\mathcal{M}(k)$. The Ward-Takahashi identity is true individually for each
diagram contributing to $\mathcal{M}_0$, once we sum over insertion points; this is what
we will prove.

When we insert our photon into one of the diagrams of $\mathcal{M}_0$, it must attach
either to an electron line that runs out of the diagram to two external points,
or to an internal electron loop. Let us consider each of these cases in turn.

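First suppose that the electron line runs between external points. Before
we insert our photon $\gamma(k)$, the line looks like this:

[Diagram: an electron line entering with momentum $p$, passing through $n$ photon vertices, and exiting with momentum $p'$.]

The electron propagators have momenta $p$, $p_1 = p + q_1$, $p_2 = p_1 + q_2$, and so
on up to $p' = p_{n-1} + q_n$. If there are $n$ vertices, we can insert our photon in
$n + 1$ different places. Suppose we insert it after the $i$th vertex:

[Diagram: the same electron line with the photon $\gamma(k)$ attached after the $i$th vertex.]

The electron propagators to the left of the new photon then have their
momenta increased by $k$.

Let us now look at the values of these diagrams, with the polarization
vector $\epsilon_\mu(k)$ replaced by $k_\mu$. The product of $k_\mu$ with the new vertex is
conveniently written:

$$
-ie\,k_\mu\gamma^\mu = -ie\left[(\not{p}_i + \not{k} - m) - (\not{p}_i - m)\right].
$$

Multiplying by the adjacent electron propagators, we obtain

$$
\frac{i}{\not{p}_i + \not{k} - m}\,(-ie\,\not{k})\,\frac{i}{\not{p}_i - m}
= e\left(\frac{i}{\not{p}_i - m} - \frac{i}{\not{p}_i + \not{k} - m}\right).
\tag{7.65}
$$

The diagram with the photon inserted at position $i$ therefore has the structure

$$
\cdots \left(\frac{i}{\not{p}_{i+1} + \not{k} - m}\right)\gamma^{\lambda_{i+1}}
\left(\frac{i}{\not{p}_i - m} - \frac{i}{\not{p}_i + \not{k} - m}\right)\gamma^{\lambda_i}
\left(\frac{i}{\not{p}_{i-1} - m}\right)\gamma^{\lambda_{i-1}} \cdots .
$$

Similarly, the diagram with the photon inserted at position $i - 1$ has the
structure

$$
\cdots \left(\frac{i}{\not{p}_{i+1} + \not{k} - m}\right)\gamma^{\lambda_{i+1}}
\left(\frac{i}{\not{p}_i + \not{k} - m}\right)\gamma^{\lambda_i}
\left(\frac{i}{\not{p}_{i-1} - m} - \frac{i}{\not{p}_{i-1} + \not{k} - m}\right)\gamma^{\lambda_{i-1}} \cdots .
$$

Note that the first term of this expression cancels the second term of the
previous expression. A similar cancellation occurs between any other pair of
diagrams with adjacent insertions. When we sum over all possible insertion
points along the line, everything cancels except the unpaired terms at the
ends. The unpaired term coming from insertion after the last vertex (on the
far left) is just $e$ times the value of the original diagram; the other unpaired
term, from insertion before the first vertex, is identical except for a minus sign
and the replacement of $p$ by $p + k$ everywhere. Diagrammatically, our result
is

$$
\sum_{\substack{\text{insertion}\\ \text{points}}} k_\mu \cdot \big[\text{line } p \to q \text{ with } \gamma(k) \text{ inserted}\big]
= e\,\Big(\big[\text{line } p \to q - k\big] - \big[\text{line } p + k \to q\big]\Big),
\tag{7.66}
$$

where we have renamed $p' + k \to q$ for the sake of symmetry.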
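The algebraic identity (7.65) that drives this telescoping cancellation can be verified numerically with explicit Dirac matrices. The sketch below assumes the Dirac representation of the $\gamma^\mu$ and the metric signature $(+,-,-,-)$. The momenta, mass, and charge are arbitrary off-shell illustrative values, and the helper `slash` is a name introduced only for this example.

```python
import numpy as np

# Dirac matrices in the Dirac representation (any representation would do).
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, sg], [-sg, Z2]]) for sg in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(v):
    """Return gamma^mu v_mu for a 4-vector with contravariant components v."""
    v_lower = metric @ v
    return sum(g * c for g, c in zip(gamma, v_lower))

# Arbitrary off-shell momenta, mass and charge (illustrative values only).
p = np.array([1.0, 0.2, -0.3, 0.5])
k = np.array([0.4, -0.1, 0.6, 0.2])
m, e = 0.7, 0.3
one = np.eye(4)

prop_p = 1j * np.linalg.inv(slash(p) - m * one)        # i/(pslash - m)
prop_pk = 1j * np.linalg.inv(slash(p + k) - m * one)   # i/(pslash + kslash - m)

lhs = prop_pk @ (-1j * e * slash(k)) @ prop_p          # left side of (7.65)
rhs = e * (prop_p - prop_pk)                           # right side of (7.65)
residual = np.max(np.abs(lhs - rhs))

print(residual)
```

The identity holds as a matrix equation for any off-shell $p$ and $k$, since it only uses $\not{k} = (\not{p} + \not{k} - m) - (\not{p} - m)$. It does not require the momenta to be on shell.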

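In each diagram on the left-hand side of (7.66), the momentum entering
the electron line is $p$ and the momentum exiting is $q$. According to the LSZ
formula, we can extract from each diagram a contribution to an $S$-matrix
element by taking the coefficient of the product of poles

$$
\left(\frac{i}{\not{q} - m}\right)\left(\frac{i}{\not{p} - m}\right).
$$

The terms on the right-hand side of (7.66) each contain one of these poles,
but neither contains both poles. Thus the right-hand side of (7.66) contributes
nothing to the $S$-matrix.*

To complete the proof of the Ward-Takahashi identity, we must consider
the case in which our photon attaches to an internal electron loop. Before the
insertion of the photon, a typical loop looks like this:

[Diagram: a closed electron loop with $n$ photon vertices.]

*This step of the argument is straightforward only if we have arranged the
perturbation series so that the propagator contains $m$ rather than $m_0$. We will do this in
Chapter 10.

The electron propagators have momenta $p_1$, $p_1 + q_2 = p_2$, and so on up to $p_n$.
Suppose now that we insert the photon $\gamma(k)$ between vertices $i$ and $i + 1$:

[Diagram: the same loop with the photon $\gamma(k)$ attached between vertices $i$ and $i + 1$.]

We now have an additional momentum $k$ running around the loop from the
new vertex; by convention, this momentum exits at vertex 1.

To evaluate the sum over all such insertions into the loop, apply
identity (7.65) to each diagram. For the diagram in which the photon is inserted
between vertices 1 and 2, we obtain

$$
-e \int \frac{d^4p_1}{(2\pi)^4}\, \mathrm{tr}\left[
\left(\frac{i}{\not{p}_n + \not{k} - m}\right)\gamma^{\lambda_n} \cdots
\left(\frac{i}{\not{p}_2 + \not{k} - m}\right)\gamma^{\lambda_2}
\left(\frac{i}{\not{p}_1 - m} - \frac{i}{\not{p}_1 + \not{k} - m}\right)\gamma^{\lambda_1}
\right].
$$

The first term will be canceled by one of the terms from the diagram with
the photon inserted between vertices 2 and 3. Similar cancellations take place
between terms from other pairs of adjacent insertions. When we sum over all
$n$ insertion points we are left with

$$
\begin{aligned}
-e \int \frac{d^4p_1}{(2\pi)^4}\, \mathrm{tr}\bigg[
&\left(\frac{i}{\not{p}_n - m}\right)\gamma^{\lambda_n}
\left(\frac{i}{\not{p}_{n-1} - m}\right)\gamma^{\lambda_{n-1}} \cdots
\left(\frac{i}{\not{p}_1 - m}\right)\gamma^{\lambda_1} \\
&- \left(\frac{i}{\not{p}_n + \not{k} - m}\right)\gamma^{\lambda_n}
\left(\frac{i}{\not{p}_{n-1} + \not{k} - m}\right)\gamma^{\lambda_{n-1}} \cdots
\left(\frac{i}{\not{p}_1 + \not{k} - m}\right)\gamma^{\lambda_1}
\bigg].
\end{aligned}
\tag{7.67}
$$

Shifting the integration variable from $p_1$ to $p_1 + k$ in the second term, we see
that the two remaining terms cancel. Thus the diagrams in which the photon
is inserted along a closed loop add up to zero.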

We are now ready to assemble the pieces of the proof. Suppose that the
amplitude $\mathcal{M}$ has $2n$ external electron lines, $n$ incoming and $n$ outgoing. Label
the incoming momenta $p_i$ and the outgoing momenta $q_i$:

$$
\mathcal{M}(k; p_1 \cdots p_n; q_1 \cdots q_n) = [\text{diagram: blob with photon } k,\ \text{incoming } p_i,\ \text{outgoing } q_i].
$$

(The amplitude can also involve an arbitrary number of additional external
photons.) The amplitude $\mathcal{M}_0$ lacks the photon $\gamma(k)$ but is otherwise identical.
To form $k_\mu\mathcal{M}^\mu$ from $\mathcal{M}_0$ we must sum over all diagrams that contribute to
$\mathcal{M}_0$, and for each diagram, sum over all points at which the photon could be
inserted. Summing over insertion points along an internal loop in any diagram
gives zero. Summing over insertion points along a through-going line in any
diagram gives a contribution of the form (7.66). Summing over all insertion
points for any particular diagram, we obtain

[Diagrammatic equation: the sum over insertion points of $k_\mu$ times the shaded circle equals $e \sum_i$ of the shaded circle with $q_i \to q_i - k$ minus the shaded circle with $p_i \to p_i + k$.]

where the shaded circle represents any particular diagram that contributes
to $\mathcal{M}_0$. Summing over all such diagrams, we finally obtain

$$
\begin{aligned}
k_\mu \mathcal{M}^\mu(k; p_1 \cdots p_n; q_1 \cdots q_n) = e \sum_i \Big[
&\mathcal{M}_0(p_1 \cdots p_n; q_1 \cdots (q_i - k) \cdots) \\
&- \mathcal{M}_0(p_1 \cdots (p_i + k) \cdots; q_1 \cdots q_n) \Big].
\end{aligned}
\tag{7.68}
$$

This is the Ward-Takahashi identity for correlation functions in QED. We saw
below (7.66) that the right-hand side does not contribute to the $S$-matrix; thus
in the special case where $\mathcal{M}$ is an $S$-matrix element, Eq. (7.68) reduces to the
Ward identity (7.64).

Before discussing this identity further, we should mention a potential flaw 
in the above proof. In order to find the necessary cancellation in Eq. (7.67), 
we had to shift the integration variable by a constant. If the integral diverges, 
however, this shift is not permissible. Similarly, there may be divergent loop- 
momentum integrals in the expressions leading to Eq. (7.66). Here there is 
no explicit shift in the proof, but in practice one does generally perform a 
shift while evaluating the integrals. In either case, ultraviolet divergences can 
potentially invalidate the Ward-Takahashi identity. We will see an example of 
this problem, as well as a general solution to it, in the next section. 

The simplest example of the Ward-Takahashi identity involves, on the
left-hand side, the three-point function with one entering and one exiting electron
and one external photon:

[Diagrammatic equation: $k_\mu$ times the three-point function with electron momenta $p$ (in) and $p + k$ (out) equals $e$ times the difference of the exact electron propagators at $p$ and at $p + k$.]

The quantities on the right-hand side are exact electron propagators,
evaluated at $p$ and $(p + k)$ respectively. Label these quantities $S(p)$ and $S(p + k)$;
from Eq. (7.23),

$$
S(p) = \frac{i}{\not{p} - m - \Sigma(p)}.
$$

The full three-point amplitude on the left-hand side can be rewritten, just
as in Eq. (7.44), as a product of full propagators for the entering and exiting
electrons, times an amputated scattering diagram. In this case, the amputated
function is just the vertex $\Gamma^\mu(p + k, p)$. Then the Ward-Takahashi identity
reads:

$$
S(p + k)\left[-ie\,k_\mu\Gamma^\mu(p + k, p)\right]S(p) = e\big(S(p) - S(p + k)\big).
$$

To simplify this equation, multiply, on the left and right respectively, by the
Dirac matrices $S^{-1}(p + k)$ and $S^{-1}(p)$. This gives

$$
-i\,k_\mu\Gamma^\mu(p + k, p) = S^{-1}(p + k) - S^{-1}(p).
\tag{7.69}
$$

Often the term Ward-Takahashi identity is used to mean only this special
case.

We can use identity (7.69) to obtain the general relation between the
renormalization factors $Z_1$ and $Z_2$. We defined $Z_1$ in (7.47) by the relation

$$
\Gamma^\mu(p + k, p) \to Z_1^{-1}\gamma^\mu \quad \text{as } k \to 0.
$$

We defined $Z_2$ as the residue of the pole in $S(p)$:

$$
S(p) \sim \frac{iZ_2}{\not{p} - m}.
$$

Setting $p$ near mass shell and expanding (7.69) about $k = 0$, we find for the
first-order terms on the left and right

$$
-iZ_1^{-1}\not{k} = -iZ_2^{-1}\not{k},
$$

that is,

$$
Z_1 = Z_2.
\tag{7.70}
$$

Thus, the Ward-Takahashi identity guarantees the exact cancellation of
infinite rescaling factors in the electron scattering amplitude that we found at
the end of Section 7.2. When combined with the correct formal expression
(7.46) for the electron form factors, this identity guarantees that $F_1(0) = 1$
to all orders in perturbation theory.

Often, in the literature, the terms Ward identity, current conservation, 
and gauge invariance are used interchangeably. This is quite natural, since 
the Ward identity is the diagrammatic expression of the conservation of the 
electric current, which is in turn a consequence of gauge invariance. In this 
book, however, we will distinguish these three concepts. By gauge invariance 
we mean the fundamental symmetry of the Lagrangian; by current conservation 
we mean the equation of motion that follows from this symmetry; and 
by the Ward identity we mean the diagrammatic identity that imposes the 
symmetry on quantum mechanical amplitudes. 


7.5 Renormalization of the Electric Charge 

At the beginning of Chapter 6 we set out to study the order-α radiative 
corrections to electron scattering from a heavy target. We evaluated (at least 
in the classical limit) the bremsstrahlung diagrams, 


and also the corrections due to virtual photons: 


Our discussion of field-strength renormalization in this chapter has finally 
clarified the role of the last two diagrams. In particular, we have seen that 
the Ward identity, through the relation $Z_1 = Z_2$, ensures that the sum of the 
virtual photon corrections vanishes as the momentum transfer q goes to zero. 

There is one remaining type of radiative correction to this process: 


This is the order-α vacuum polarization diagram, also known as the photon 
self-energy. It can be viewed as a modification to the photon structure by a 
virtual electron-positron pair. This diagram will alter the effective field $A^\mu(x)$ 
seen by the scattered electron. It can potentially shift the overall strength of 
this field, and can also change its dependence on x (or in Fourier space, on 
q). In this section we will compute this diagram, and see that it has both of 
these effects. 

Overview of Charge Renormalization 

Before beginning a detailed calculation, let’s ask what kind of an answer we 
expect and what its interpretation will be. The interesting part of the diagram 
is the electron loop: 


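$$\bigl(\text{electron loop}\bigr) = (-ie)^2 (-1) \int \frac{d^4k}{(2\pi)^4}\, \mathrm{tr}\left[\gamma^\mu \frac{i}{\not{k} - m}\, \gamma^\nu \frac{i}{\not{k} + \not{q} - m}\right] \equiv i\Pi_2^{\mu\nu}(q). \tag{7.71}$$

(The fermion loop factor of (−1) was derived in Eq. (4.120).) More generally, 
let us define $i\Pi^{\mu\nu}(q)$ to be the sum of all 1-particle-irreducible insertions into 
the photon propagator, 

$$\bigl(\text{sum of all 1PI insertions}\bigr) = i\Pi^{\mu\nu}(q), \tag{7.72}$$

so that $\Pi_2^{\mu\nu}(q)$ is the second-order (in e) contribution to $\Pi^{\mu\nu}(q)$. 

The only tensors that can appear in $\Pi^{\mu\nu}(q)$ are $g^{\mu\nu}$ and $q^\mu q^\nu$. The Ward 
identity, however, tells us that $q_\mu \Pi^{\mu\nu}(q) = 0$. This implies that $\Pi^{\mu\nu}(q)$ is 
proportional to the projector $(g^{\mu\nu} - q^\mu q^\nu/q^2)$. Furthermore, we expect that 
$\Pi^{\mu\nu}(q)$ will not have a pole at $q^2 = 0$; the only obvious source of such a pole 
would be a single-massless-particle intermediate state, which cannot occur in 
any 1PI diagram.† It is therefore convenient to extract the tensor structure 
from $\Pi^{\mu\nu}$ in the following way: 

$$\Pi^{\mu\nu}(q) = (q^2 g^{\mu\nu} - q^\mu q^\nu)\,\Pi(q^2), \tag{7.73}$$

where $\Pi(q^2)$ is regular at $q^2 = 0$. 

Using this notation, the exact photon two-point function is 

$$\begin{aligned}
\bigl(\text{exact photon propagator}\bigr)_{\mu\nu}
&= \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\, i\bigl(q^2 g^{\rho\sigma} - q^\rho q^\sigma\bigr)\Pi(q^2)\, \frac{-ig_{\sigma\nu}}{q^2} + \cdots \\
&= \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\,\Delta^\rho_\nu\,\Pi(q^2) + \frac{-ig_{\mu\rho}}{q^2}\,\Delta^\rho_\sigma \Delta^\sigma_\nu\,\Pi^2(q^2) + \cdots,
\end{aligned}$$

where $\Delta^\rho_\nu \equiv \delta^\rho_\nu - q^\rho q_\nu/q^2$. Noting that $\Delta^\rho_\sigma \Delta^\sigma_\nu = \Delta^\rho_\nu$, we can simplify this 
expression to 

$$\begin{aligned}
\bigl(\text{exact photon propagator}\bigr)_{\mu\nu}
&= \frac{-ig_{\mu\nu}}{q^2} + \frac{-ig_{\mu\rho}}{q^2}\left(\delta^\rho_\nu - \frac{q^\rho q_\nu}{q^2}\right)\bigl(\Pi(q^2) + \Pi^2(q^2) + \cdots\bigr) \\
&= \frac{-i}{q^2\bigl(1 - \Pi(q^2)\bigr)}\left(g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right) + \frac{-i}{q^2}\left(\frac{q_\mu q_\nu}{q^2}\right).
\end{aligned} \tag{7.74}$$

†One can prove that there is no such pole, but the proof is nontrivial. Schwinger 
has shown that, in two spacetime dimensions, the singularity in $\Pi_2$ due to a pair of 
massless fermions is a pole rather than a cut; this is a famous counterexample to our 
argument. There is no such problem in four dimensions. 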
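
The resummation leading to (7.74) can be checked numerically. The sketch below
is illustrative only: it treats $\Pi(q^2)$ as a plain number `pi` with
|pi| < 1 (an assumption made so that the geometric series converges), builds
the mixed-index projector $\Delta^\rho_\nu$ for a sample momentum, and compares
the truncated series with the closed form. The function names and the sample
values are hypothetical.

```python
# Numerical check of the geometric series behind Eq. (7.74).
METRIC = [1.0, -1.0, -1.0, -1.0]  # diag(+,-,-,-)


def projector(q):
    """Return (Delta^rho_nu, q_mu with lower index, q^2) for contravariant q."""
    q_lower = [METRIC[m] * q[m] for m in range(4)]
    q2 = sum(q[m] * q_lower[m] for m in range(4))
    delta = [[(1.0 if r == n else 0.0) - q[r] * q_lower[n] / q2
              for n in range(4)] for r in range(4)]
    return delta, q_lower, q2


def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(4)) for j in range(4)]
            for i in range(4)]


def series_propagator(q, pi, terms=200):
    """Sum -i g/q^2 + sum_n (-i g_{mu rho}/q^2) (Delta^n)^rho_nu pi^n."""
    delta, _, q2 = projector(q)
    total = [[(-1j / q2) * (METRIC[m] if m == n else 0.0) for n in range(4)]
             for m in range(4)]
    power = [[1.0 if r == n else 0.0 for n in range(4)] for r in range(4)]
    for n in range(1, terms + 1):
        power = matmul(power, delta)
        for mu in range(4):
            for nu in range(4):
                # g_{mu rho} is diagonal, so lowering rho -> mu is a sign factor
                total[mu][nu] += (-1j / q2) * METRIC[mu] * power[mu][nu] * pi ** n
    return total


def closed_form_propagator(q, pi):
    """Right-hand side of Eq. (7.74)."""
    _, ql, q2 = projector(q)
    return [[(-1j / (q2 * (1 - pi)))
             * ((METRIC[m] if m == n else 0.0) - ql[m] * ql[n] / q2)
             + (-1j / q2) * ql[m] * ql[n] / q2
             for n in range(4)] for m in range(4)]
```

For a spacelike sample momentum such as `q = [0.3, 1.0, -0.4, 0.2]` and
`pi = 0.1`, the two functions should agree to rounding accuracy, and
`matmul(delta, delta)` should reproduce `delta`.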


In any S-matrix element calculation, at least one end of this exact propagator 
will connect to a fermion line. When we sum over all places along the 
line where it could connect, we must find, according to the Ward identity, 
that terms proportional to $q_\mu$ or $q_\nu$ vanish. For the purposes of computing 
S-matrix elements, therefore, we can abbreviate 

$$\bigl(\text{exact photon propagator}\bigr)_{\mu\nu} = \frac{-ig_{\mu\nu}}{q^2\bigl(1 - \Pi(q^2)\bigr)}. \tag{7.75}$$

Notice that as long as $\Pi(q^2)$ is regular at $q^2 = 0$, the exact propagator always 
has a pole at $q^2 = 0$. In other words, the photon remains absolutely massless 
at all orders in perturbation theory. This claim depends strongly on our use of 
the Ward identity in (7.73). If, for example, $\Pi^{\mu\nu}(q)$ contained a term $M^2 g^{\mu\nu}$ 
(with no compensating $q^\mu q^\nu$ term), the photon mass would be shifted to M. 
The residue of the $q^2 = 0$ pole is 

$$\frac{1}{1 - \Pi(0)} \equiv Z_3.$$

The amplitude for any low-$q^2$ scattering process will be shifted by this factor, 
relative to the tree-level approximation: 

$$\cdots\,\frac{e^2 g_{\mu\nu}}{q^2}\,\cdots \;\longrightarrow\; \cdots\,\frac{Z_3\, e^2 g_{\mu\nu}}{q^2}\,\cdots.$$

Since a factor of e lies at each end of the photon propagator, we can conveniently 
account for this shift by making the replacement $e \to \sqrt{Z_3}\,e$. This 
replacement is called charge renormalization; it is in many ways analogous to 
the mass renormalization introduced in Section 7.1. Note in particular that 
the “physical” electron charge measured in experiments is $\sqrt{Z_3}\,e$. We will 
therefore shift our notation and call this quantity simply e. From now on we 
will refer to the “bare” charge (the quantity that multiplies $A_\mu \bar\psi \gamma^\mu \psi$ in the 
Lagrangian) as $e_0$. We then have 

$$(\text{physical charge}) = e = \sqrt{Z_3}\,e_0 = \sqrt{Z_3} \cdot (\text{bare charge}). \tag{7.76}$$

To lowest order, $Z_3 = 1$ and $e = e_0$. 

In addition to this constant shift in the strength of the electric charge, 
$\Pi(q^2)$ has another effect. Consider a scattering process with nonzero $q^2$, and 
suppose that we have computed $\Pi(q^2)$ to leading order in α: $\Pi(q^2) \approx \Pi_2(q^2)$. 
The amplitude for the process will then involve the quantity 

$$\frac{-ig_{\mu\nu}}{q^2}\left(\frac{e_0^2}{1 - \Pi(q^2)}\right) \underset{O(\alpha)}{=} \frac{-ig_{\mu\nu}}{q^2}\left(\frac{e^2}{1 - \bigl[\Pi_2(q^2) - \Pi_2(0)\bigr]}\right).$$

(Swapping $e^2$ for $e_0^2$ does not matter to lowest order.) The quantity in parentheses 
can be interpreted as a $q^2$-dependent electric charge. The full effect of 
replacing the tree-level photon propagator with the exact photon propagator 
is therefore to replace 

$$\alpha_0 \to \alpha_{\text{eff}}(q^2) = \frac{e_0^2/4\pi}{1 - \Pi(q^2)} \underset{O(\alpha)}{=} \frac{\alpha}{1 - \bigl[\Pi_2(q^2) - \Pi_2(0)\bigr]}. \tag{7.77}$$

(To leading order we could just as well bring the Π-terms into the numerator; 
but we will see in Chapter 12 that in this form, the expression is true to all 
orders when $\Pi_2$ is replaced by Π.) 


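Computation of $\Pi_2$ 

Having worked so hard to interpret $\Pi_2(q^2)$, we had better calculate it. Going 
back to (7.71), we have 

$$\begin{aligned}
i\Pi_2^{\mu\nu}(q) &= -(-ie)^2 \int \frac{d^4k}{(2\pi)^4}\, \mathrm{tr}\left[\gamma^\mu \frac{i(\not{k} + m)}{k^2 - m^2}\, \gamma^\nu \frac{i(\not{k} + \not{q} + m)}{(k+q)^2 - m^2}\right] \\
&= -4e^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{k^\mu (k+q)^\nu + k^\nu (k+q)^\mu - g^{\mu\nu}\bigl(k \cdot (k+q) - m^2\bigr)}{(k^2 - m^2)\bigl((k+q)^2 - m^2\bigr)}.
\end{aligned} \tag{7.78}$$

We have written e and m instead of $e_0$ and $m_0$ for convenience, since the 
difference would give only an order-$\alpha^2$ contribution to $\Pi^{\mu\nu}$. 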

Now introduce a Feynman parameter to combine the denominator factors: 

$$\frac{1}{(k^2 - m^2)\bigl((k+q)^2 - m^2\bigr)} = \int_0^1 dx\, \frac{1}{(k^2 + 2x\,k \cdot q + x q^2 - m^2)^2} = \int_0^1 dx\, \frac{1}{\bigl(\ell^2 + x(1-x)q^2 - m^2\bigr)^2},$$

where $\ell = k + xq$. In terms of ℓ, the numerator of (7.78) is 

$$\text{Numerator} = 2\ell^\mu \ell^\nu - g^{\mu\nu}\ell^2 - 2x(1-x)\,q^\mu q^\nu + g^{\mu\nu}\bigl(m^2 + x(1-x)q^2\bigr) + (\text{terms linear in } \ell).$$
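
The change of variables can be verified with random numbers. In the sketch
below the explicit form of the linear terms,
$(1-2x)\bigl(\ell^\mu q^\nu + q^\mu \ell^\nu - g^{\mu\nu}\,\ell \cdot q\bigr)$,
is worked out here for the purpose of the check and is not quoted from the
text; these terms integrate to zero by symmetry. The function names, the mass
value, and the random sampling are assumptions made for illustration.

```python
import random

METRIC = [1.0, -1.0, -1.0, -1.0]  # diag(+,-,-,-)


def dot(a, b):
    """Minkowski product of two contravariant 4-vectors."""
    return sum(METRIC[m] * a[m] * b[m] for m in range(4))


def g_upper(mu, nu):
    return METRIC[mu] if mu == nu else 0.0


def numerator_k(k, q, m, mu, nu):
    """Numerator of (7.78) written in the original loop momentum k."""
    kq = [a + b for a, b in zip(k, q)]
    return (k[mu] * kq[nu] + k[nu] * kq[mu]
            - g_upper(mu, nu) * (dot(k, kq) - m * m))


def numerator_ell(ell, q, x, m, mu, nu):
    """Same numerator in the shifted momentum ell = k + x q."""
    even = (2 * ell[mu] * ell[nu] - g_upper(mu, nu) * dot(ell, ell)
            - 2 * x * (1 - x) * q[mu] * q[nu]
            + g_upper(mu, nu) * (m * m + x * (1 - x) * dot(q, q)))
    linear = (1 - 2 * x) * (ell[mu] * q[nu] + q[mu] * ell[nu]
                            - g_upper(mu, nu) * dot(ell, q))
    return even + linear


def max_mismatch(seed=0, m=0.7):
    """Largest difference between the k-form and ell-form for one random point."""
    rng = random.Random(seed)
    k = [rng.uniform(-2, 2) for _ in range(4)]
    q = [rng.uniform(-2, 2) for _ in range(4)]
    x = rng.random()
    ell = [k[i] + x * q[i] for i in range(4)]
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            diff = numerator_k(k, q, m, mu, nu) - numerator_ell(ell, q, x, m, mu, nu)
            worst = max(worst, abs(diff))
    # Denominator: k^2 + 2x k.q + x q^2 - m^2  versus  ell^2 + x(1-x) q^2 - m^2
    d_k = dot(k, k) + 2 * x * dot(k, q) + x * dot(q, q) - m * m
    d_ell = dot(ell, ell) + x * (1 - x) * dot(q, q) - m * m
    return max(worst, abs(d_k - d_ell))
```

`max_mismatch(seed)` should be at the level of rounding error for any seed.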


Performing a Wick rotation and substituting $\ell^0 = i\ell^0_E$, we obtain 

$$i\Pi_2^{\mu\nu}(q) = -4ie^2 \int_0^1 dx \int \frac{d^4\ell_E}{(2\pi)^4}\, \frac{-\tfrac{1}{2} g^{\mu\nu}\ell_E^2 + g^{\mu\nu}\ell_E^2 - 2x(1-x)\,q^\mu q^\nu + g^{\mu\nu}\bigl(m^2 + x(1-x)q^2\bigr)}{(\ell_E^2 + \Delta)^2}, \tag{7.79}$$

where $\Delta = m^2 - x(1-x)q^2$. This integral is very badly ultraviolet divergent. 
If we were to cut it off at $\ell_E = \Lambda$, we would find for the leading term, 

$$i\Pi_2^{\mu\nu}(q) \propto e^2 \Lambda^2 g^{\mu\nu},$$

with no compensating $q^\mu q^\nu$ term. This result violates the Ward identity; it 
would give the photon an infinite mass $M \propto e\Lambda$. 

Our proof of the Ward identity has failed, in precisely the way anticipated 
at the end of the previous section: The shift of the integration variable in (7.67) 
is not permissible when the integral is divergent. In our present calculation, 
the failure of the Ward identity is catastrophic: It leads to an infinite photon 
mass,† in conflict with experiment. Fortunately, there is a way to rescue the 
Ward identity. 

In the above analysis we regulated the divergent integral in the most 
straightforward and most naive way: by cutting it off at a large momentum Λ. 
But other regulators are possible, and some will in fact preserve the Ward identity. 
In our computations of the vertex and electron self-energy diagrams, we 
used a Pauli-Villars regulator. This regulator preserved the relation $Z_1 = Z_2$, 
a consequence of the Ward identity; a naive cutoff does not (see Problem 7.2). 
We could fix our present computation by introducing Pauli-Villars fermions. 
Unfortunately, several sets of fermions are required, making the method rather 
complicated.‡ We will use a simpler method, dimensional regularization, due 
to ’t Hooft and Veltman.§ Dimensional regularization preserves the symmetries 
of QED and also of a wide class of more general theories. 

The question of which regulator to use has no a priori answer in quantum 
field theory. Often the choice has no effect on the predictions of the theory. 
When two regulators give different answers for observable quantities, it is generally 
because some symmetry (such as the Ward identity) is being violated 
by one (or both) of them. In these cases we take the symmetry to be fundamental 
and demand that it be preserved by the regulator. But the validity of 
this choice cannot be proven; we are adopting the symmetry as a new axiom. 

†We could still make the observed photon mass zero by adding a compensating 
infinite photon mass term to the Lagrangian. More generally, we could add terms to 
the Lagrangian to make the Ward identity valid for any n-point correlation function. 
This procedure would give the same results as the one we are about to follow, but 
would be much more complicated. 

‡This method is presented in Bjorken and Drell (1964), p. 154. 

§G. ’t Hooft and M. J. G. Veltman, Nucl. Phys. B44, 189 (1972). 


Dimensional Regularization 

The idea of dimensional regularization is very simple to state: Compute the 
Feynman diagram as an analytic function of the dimensionality of spacetime, 
d. For sufficiently small d, any loop-momentum integral will converge 
and therefore the Ward identity can be proved. The final expression for any 
observable quantity should have a well-defined limit as d → 4. 

Let us do a practice calculation to see how this technique works. We 
consider spacetime to have one time dimension and (d − 1) space dimensions. 
Then we can Wick-rotate Feynman integrals as before, to give integrals over 
a d-dimensional Euclidean space. A typical example is 

$$\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{1}{(\ell_E^2 + \Delta)^2} = \int \frac{d\Omega_d}{(2\pi)^d} \cdot \int_0^\infty d\ell_E\, \frac{\ell_E^{d-1}}{(\ell_E^2 + \Delta)^2}. \tag{7.80}$$


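The first factor in (7.80) contains the area of a unit sphere in d dimensions. 
To compute it, use the following trick: 

$$\begin{aligned}
(\sqrt{\pi})^d = \left(\int dx\, e^{-x^2}\right)^d &= \int d^dx\, \exp\Bigl(-\sum_{i=1}^{d} x_i^2\Bigr) \\
&= \int d\Omega_d \int_0^\infty dx\, x^{d-1} e^{-x^2} = \left(\int d\Omega_d\right) \cdot \frac{1}{2} \int_0^\infty d(x^2)\, (x^2)^{\frac{d}{2}-1} e^{-(x^2)} \\
&= \left(\int d\Omega_d\right) \cdot \frac{1}{2}\Gamma(d/2).
\end{aligned}$$

So the area of a d-dimensional unit sphere is 

$$\int d\Omega_d = \frac{2\pi^{d/2}}{\Gamma(d/2)}. \tag{7.81}$$

This formula reproduces the familiar special cases: 

    d    Γ(d/2)    ∫dΩ_d
    1    √π        2
    2    1         2π
    3    √π/2      4π
    4    1         2π²

The second factor in (7.80) is 

$$\begin{aligned}
\int_0^\infty d\ell\, \frac{\ell^{d-1}}{(\ell^2 + \Delta)^2} &= \frac{1}{2} \int_0^\infty d(\ell^2)\, \frac{(\ell^2)^{\frac{d}{2}-1}}{(\ell^2 + \Delta)^2} \\
&= \frac{1}{2} \left(\frac{1}{\Delta}\right)^{2-\frac{d}{2}} \int_0^1 dx\, x^{1-\frac{d}{2}} (1-x)^{\frac{d}{2}-1},
\end{aligned}$$

where we have substituted $x = \Delta/(\ell^2 + \Delta)$ in the second line. Using the 
definition of the beta function, 

$$\int_0^1 dx\, x^{\alpha-1}(1-x)^{\beta-1} = B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}, \tag{7.82}$$

we can easily evaluate the integral over x. The final result for the d-dimensional 
integral is 

$$\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{1}{(\ell_E^2 + \Delta)^2} = \frac{1}{(4\pi)^{d/2}}\, \frac{\Gamma(2 - \frac{d}{2})}{\Gamma(2)} \left(\frac{1}{\Delta}\right)^{2-\frac{d}{2}}.$$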
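
Both factors can be checked numerically in dimensions where the integral
converges. The following sketch is a rough illustration rather than a general
tool: it evaluates the sphere area (7.81) with `math.gamma`, and compares the
closed-form result above with a direct Simpson-rule evaluation of the radial
integral for d = 1, 2, 3. The substitution $\ell = \sqrt{\Delta}\tan\theta$,
the function names, and the step count are choices made here, not taken from
the text.

```python
import math


def sphere_area(d):
    """Area of the unit sphere in d dimensions, Eq. (7.81)."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)


def radial_integral(d, delta, steps=2000):
    """Simpson estimate of int_0^inf dl l^(d-1) / (l^2 + delta)^2.

    With l = sqrt(delta) tan(theta) the integrand becomes
    delta^(d/2 - 2) sin^(d-1)(theta) cos^(3-d)(theta) on [0, pi/2],
    which is smooth for d = 1, 2, 3.
    """
    def f(theta):
        return math.sin(theta) ** (d - 1) * math.cos(theta) ** (3 - d)

    h = (math.pi / 2) / steps
    total = f(0.0) + f(math.pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return delta ** (d / 2 - 2) * total * h / 3


def dimreg_integral_numeric(d, delta):
    """Left-hand side of (7.80), assembled from its two factors."""
    return sphere_area(d) / (2 * math.pi) ** d * radial_integral(d, delta)


def dimreg_integral_formula(d, delta):
    """Closed form: Gamma(2 - d/2) / ((4 pi)^(d/2) Gamma(2)) * delta^(d/2 - 2)."""
    return (math.gamma(2 - d / 2) / math.gamma(2)
            / (4 * math.pi) ** (d / 2) * delta ** (d / 2 - 2))
```

For example, `dimreg_integral_numeric(3, 0.8)` and
`dimreg_integral_formula(3, 0.8)` should agree closely; at d = 4 the closed
form hits the pole of the Gamma function and the numerical integral diverges.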

Since Γ(z) has isolated poles at z = 0, −1, −2, ..., this integral has 
isolated poles at d = 4, 6, 8, .... To find the behavior near d = 4, define 
ε = 4 − d, and use the approximation† 

$$\Gamma\bigl(2 - \tfrac{d}{2}\bigr) = \Gamma(\epsilon/2) = \frac{2}{\epsilon} - \gamma + O(\epsilon), \tag{7.83}$$

where γ ≈ .5772 is the Euler-Mascheroni constant. (This constant will always 
cancel in observable quantities.) The integral is then 

$$\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{1}{(\ell_E^2 + \Delta)^2} \underset{d \to 4}{\longrightarrow} \frac{1}{(4\pi)^2} \left(\frac{2}{\epsilon} - \log\Delta - \gamma + \log(4\pi) + O(\epsilon)\right). \tag{7.84}$$

When we defined this integral with a Pauli-Villars regulator in Eq. (7.18), we 
found 

$$\int \frac{d^4\ell_E}{(2\pi)^4}\, \frac{1}{(\ell_E^2 + \Delta)^2} \underset{\Lambda \to \infty}{\longrightarrow} \frac{1}{(4\pi)^2} \left(\log\frac{x\Lambda^2}{\Delta} + O(\Lambda^{-1})\right).$$

Thus the 1/ε pole in dimensional regularization corresponds to a logarithmic 
divergence in the momentum integral. Note the curious fact that (7.84) 
involves the logarithm of Δ, a dimensionful quantity. The scale of the logarithm 
is hidden in the 1/ε term, and appears explicitly when the divergence 
is canceled. 

†This expansion follows immediately from the infinite product representation 

$$\frac{1}{\Gamma(z)} = z\, e^{\gamma z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n}.$$
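
The size of the neglected O(ε) terms can be gauged numerically. The sketch
below compares the exact d-dependent factor
$\Gamma(\epsilon/2)\,(4\pi/\Delta)^{\epsilon/2}$, which multiplies
$1/(4\pi)^2$ in the closed-form integral, with the expansion in (7.84). The
function names and the sample values of ε and Δ are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_pole_approx(eps):
    """Leading terms of Gamma(eps/2) near eps = 0, as in Eq. (7.83)."""
    return 2 / eps - EULER_GAMMA


def dimreg_bracket_exact(eps, delta):
    """Gamma(eps/2) (4 pi / delta)^(eps/2), the exact factor before expanding."""
    return math.gamma(eps / 2) * (4 * math.pi / delta) ** (eps / 2)


def dimreg_bracket_expanded(eps, delta):
    """The bracket of Eq. (7.84) without the O(eps) remainder."""
    return 2 / eps - math.log(delta) - EULER_GAMMA + math.log(4 * math.pi)
```

With ε = 0.001 and Δ = 2.5 (in arbitrary units), the two bracket functions
should differ only by a small amount that shrinks linearly with ε.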

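You can easily verify the more general integration formulae, 

$$\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{1}{(\ell_E^2 + \Delta)^n} = \frac{1}{(4\pi)^{d/2}}\, \frac{\Gamma(n - \frac{d}{2})}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n-\frac{d}{2}}; \tag{7.85}$$

$$\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{\ell_E^2}{(\ell_E^2 + \Delta)^n} = \frac{1}{(4\pi)^{d/2}}\, \frac{d}{2}\, \frac{\Gamma(n - \frac{d}{2} - 1)}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n-\frac{d}{2}-1}. \tag{7.86}$$

In d dimensions, $g^{\mu\nu}$ obeys $g_{\mu\nu} g^{\mu\nu} = d$. Thus, if the numerator of a symmetric 
integrand contains $\ell^\mu \ell^\nu$, we should replace 

$$\ell^\mu \ell^\nu \to \frac{1}{d}\, \ell^2 g^{\mu\nu}, \tag{7.87}$$

in analogy with Eq. (6.46). In QED, the Dirac matrices can be manipulated 
as a set of d matrices satisfying 

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}, \qquad \mathrm{tr}[1] = 4. \tag{7.88}$$

In manipulating Eq. (7.78), these rules give the same result as the purely 
four-dimensional rules. However, in the evaluation of other diagrams, there 
are additional contributions of order ε. In particular, the contraction identities 
(5.9) are modified in d = 4 − ε to 

$$\begin{aligned}
\gamma^\mu \gamma^\nu \gamma_\mu &= -(2 - \epsilon)\gamma^\nu, \\
\gamma^\mu \gamma^\nu \gamma^\rho \gamma_\mu &= 4g^{\nu\rho} - \epsilon\, \gamma^\nu \gamma^\rho, \\
\gamma^\mu \gamma^\nu \gamma^\rho \gamma^\sigma \gamma_\mu &= -2\gamma^\sigma \gamma^\rho \gamma^\nu + \epsilon\, \gamma^\nu \gamma^\rho \gamma^\sigma.
\end{aligned} \tag{7.89}$$

These extra terms can contribute to the final value of the Feynman diagram 
if they multiply a factor $\epsilon^{-1}$ from a divergent integral. In QED at one-loop 
order, such extra terms appear in the vertex and self-energy diagrams but 
cancel when these diagrams are combined to compute an observable quantity. 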


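Computation of $\Pi_2$, Continued 


Now let us apply these dimensional regularization formulae to the momentum 
integral in (7.79). The unpleasant terms with $\ell^2$ in the numerator give 

$$\begin{aligned}
\int \frac{d^d\ell_E}{(2\pi)^d}\, \frac{(-\frac{2}{d} + 1)\, g^{\mu\nu} \ell_E^2}{(\ell_E^2 + \Delta)^2}
&= \frac{-1}{(4\pi)^{d/2}} \Bigl(1 - \frac{d}{2}\Bigr)\, \Gamma\Bigl(1 - \frac{d}{2}\Bigr) \left(\frac{1}{\Delta}\right)^{1-\frac{d}{2}} g^{\mu\nu} \\
&= \frac{1}{(4\pi)^{d/2}}\, \Gamma\Bigl(2 - \frac{d}{2}\Bigr) \left(\frac{1}{\Delta}\right)^{2-\frac{d}{2}} \bigl(-\Delta\, g^{\mu\nu}\bigr).
\end{aligned}$$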


We would have expected a pole at d = 2, since the quadratic divergence in 4 
dimensions becomes a logarithmic divergence in 2 dimensions. But the pole 
cancels. The Ward identity is working. 

Evaluating the remaining terms in (7.79) and using $\Delta = m^2 - x(1-x)q^2$, 
we obtain 

$$\begin{aligned}
i\Pi_2^{\mu\nu}(q) &= -4ie^2 \int_0^1 dx\, \frac{1}{(4\pi)^{d/2}}\, \frac{\Gamma(2 - \frac{d}{2})}{\Delta^{2-d/2}} \\
&\qquad \times \Bigl[g^{\mu\nu}\bigl(-m^2 + x(1-x)q^2\bigr) + g^{\mu\nu}\bigl(m^2 + x(1-x)q^2\bigr) - 2x(1-x)\,q^\mu q^\nu\Bigr] \\
&= (q^2 g^{\mu\nu} - q^\mu q^\nu) \cdot i\Pi_2(q^2),
\end{aligned}$$

where 

$$\begin{aligned}
\Pi_2(q^2) &= \frac{-8e^2}{(4\pi)^{d/2}} \int_0^1 dx\, x(1-x)\, \frac{\Gamma(2 - \frac{d}{2})}{\Delta^{2-d/2}} \\
&\underset{d \to 4}{\longrightarrow} -\frac{2\alpha}{\pi} \int_0^1 dx\, x(1-x) \left(\frac{2}{\epsilon} - \log\Delta - \gamma + \log(4\pi)\right) \qquad (\epsilon = 4 - d).
\end{aligned} \tag{7.90}$$

With dimensional regularization, $\Pi_2^{\mu\nu}(q)$ indeed takes the form required by 
the Ward identity. But it is still logarithmically divergent. 

We can now compute the order-α shift in the electric charge: 

$$\frac{e^2 - e_0^2}{e_0^2} = \delta Z_3 \underset{O(\alpha)}{=} \Pi_2(0) \approx -\frac{2\alpha}{3\pi\epsilon}.$$

The bare charge is infinitely larger than the observed charge. But this difference 
is not observable. What can be observed is the $q^2$ dependence of the 
effective electric charge (7.77). This quantity depends on the difference 

$$\hat\Pi_2(q^2) \equiv \Pi_2(q^2) - \Pi_2(0) = -\frac{2\alpha}{\pi} \int_0^1 dx\, x(1-x) \log\left(\frac{m^2}{m^2 - x(1-x)q^2}\right), \tag{7.91}$$

which is independent of ε in the limit ε → 0. For the rest of this section we 
will investigate what physics this expression contains. 
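
Because (7.91) is a finite one-dimensional integral, it is easy to evaluate
numerically. The sketch below is an illustration under stated assumptions: it
uses Simpson's rule, works in units where m = 1 by default, takes a fixed
input value for α, and is only valid for $q^2 < 4m^2$, where the argument of
the logarithm stays positive. The two comparison functions encode the small-
and large-$|q^2|$ limits that are discussed later in this section; all
function names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (assumed input value)


def pi2_hat(q2, m=1.0, alpha=ALPHA, steps=2000):
    """Eq. (7.91) by Simpson's rule; requires q2 < 4 m^2."""
    def f(x):
        w = x * (1 - x)
        return w * math.log(m * m / (m * m - w * q2))

    h = 1.0 / steps
    total = f(0.0) + f(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return -(2 * alpha / math.pi) * total * h / 3


def pi2_hat_small_q2(q2, m=1.0, alpha=ALPHA):
    """Leading term for |q2| << m^2, from expanding the logarithm."""
    return -alpha * q2 / (15 * math.pi * m * m)


def pi2_hat_large_q2(q2, m=1.0, alpha=ALPHA):
    """Leading behaviour for -q2 >> m^2: (alpha/3pi) [log(-q2/m^2) - 5/3]."""
    return (alpha / (3 * math.pi)) * (math.log(-q2 / (m * m)) - 5.0 / 3.0)


def alpha_eff(q2, m=1.0, alpha=ALPHA):
    """Effective coupling of Eq. (7.77) with Pi replaced by Pi_2."""
    return alpha / (1 - pi2_hat(q2, m, alpha))
```

For spacelike momenta ($q^2 < 0$) the function `pi2_hat` is positive, so
`alpha_eff` exceeds α and grows slowly with $-q^2$.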


Interpretation of $\Pi_2$ 

First consider the analytic structure of $\hat\Pi_2(q^2)$. For $q^2 < 0$, as is the case when 
the photon propagator is in the t- or u-channel, $\hat\Pi_2(q^2)$ is manifestly real and 
analytic. But for an s-channel process, $q^2$ will be positive. The logarithm 
function has a branch cut when its argument becomes negative, that is, when 

$$m^2 - x(1-x)q^2 < 0.$$

The product x(1−x) is at most 1/4, so $\hat\Pi_2(q^2)$ has a branch cut beginning at 

$$q^2 = 4m^2,$$

at the threshold for creation of a real electron-positron pair. 

Let us calculate the imaginary part of $\hat\Pi_2$ for $q^2 > 4m^2$. For any fixed $q^2$, 
the x-values that contribute are between the points $x = \frac{1}{2} \pm \frac{1}{2}\beta$, where 
$\beta = \sqrt{1 - 4m^2/q^2}$. Since $\mathrm{Im}[\log(-X \pm i\epsilon)] = \pm\pi$, we have 

$$\begin{aligned}
\mathrm{Im}\bigl[\hat\Pi_2(q^2 \pm i\epsilon)\bigr] &= -\frac{2\alpha}{\pi}\,(\pm\pi) \int_{\frac{1}{2} - \frac{1}{2}\beta}^{\frac{1}{2} + \frac{1}{2}\beta} dx\, x(1-x) \\
&= \mp 2\alpha \int_{-\beta/2}^{\beta/2} dy\, \bigl(\tfrac{1}{4} - y^2\bigr) \qquad (y \equiv x - \tfrac{1}{2}) \\
&= \mp \frac{\alpha}{3} \sqrt{1 - \frac{4m^2}{q^2}} \left(1 + \frac{2m^2}{q^2}\right).
\end{aligned} \tag{7.92}$$

This dependence on $q^2$ is exactly the same as in Eq. (5.13), the cross section for 
production of a fermion-antifermion pair. That is just what we would expect 
from the unitarity relation shown in Fig. 7.6(b); the cut through the diagram 
for forward Bhabha scattering gives the total cross section for $e^+e^- \to f\bar{f}$. 
The parameter β is precisely the velocity of the fermions in the center-of-mass 
frame. 
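
The steps in (7.92) can be cross-checked with a few lines of code. The sketch
below evaluates the first line of (7.92) by Simpson's rule (which is exact for
the quadratic integrand up to rounding) and compares it with the closed form
in the last line. The function names, the `sign` argument that selects the
side of the cut, and the sample values are assumptions made for illustration.

```python
import math


def im_pi2_hat_closed(q2, m, alpha, sign=+1):
    """Last line of (7.92) for Im Pi2_hat(q2 + sign * i * eps)."""
    if q2 <= 4 * m * m:
        return 0.0  # below threshold the function is real
    beta = math.sqrt(1 - 4 * m * m / q2)
    return -sign * (alpha / 3) * beta * (1 + 2 * m * m / q2)


def im_pi2_hat_integral(q2, m, alpha, sign=+1, steps=200):
    """First line of (7.92): integrate x(1-x) over the x-range on the cut."""
    if q2 <= 4 * m * m:
        return 0.0
    beta = math.sqrt(1 - 4 * m * m / q2)
    lo, hi = 0.5 - 0.5 * beta, 0.5 + 0.5 * beta
    h = (hi - lo) / steps
    total = lo * (1 - lo) + hi * (1 - hi)
    for i in range(1, steps):
        x = lo + i * h
        total += (4 if i % 2 else 2) * x * (1 - x)
    integral = total * h / 3
    return -(2 * alpha / math.pi) * (sign * math.pi) * integral
```

Both functions vanish below $q^2 = 4m^2$ and should agree above it, for
example at $q^2 = 10\,m^2$.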

Next let us examine how $\hat\Pi_2(q^2)$ modifies the electromagnetic interaction, 
as determined by Eq. (7.77). In the nonrelativistic limit it makes sense to 
compute the potential V(r). For the interaction between unlike charges, we 
have, in analogy with Eq. (4.126), 

$$V(\mathbf{x}) = \int \frac{d^3q}{(2\pi)^3}\, e^{i\mathbf{q} \cdot \mathbf{x}}\, \frac{-e^2}{|\mathbf{q}|^2 \bigl[1 - \hat\Pi_2(-|\mathbf{q}|^2)\bigr]}. \tag{7.93}$$

Expanding $\hat\Pi_2$ for $|q^2| \ll m^2$, we obtain 

$$V(\mathbf{x}) = -\frac{\alpha}{r} - \frac{4\alpha^2}{15m^2}\, \delta^{(3)}(\mathbf{x}). \tag{7.94}$$


The correction term indicates that the electromagnetic force becomes much 
stronger at small distances. This effect can be measured in the hydrogen atom, 
where the energy levels are shifted by 

$$\Delta E = \int d^3x\, |\psi(\mathbf{x})|^2 \cdot \left(-\frac{4\alpha^2}{15m^2}\, \delta^{(3)}(\mathbf{x})\right) = -\frac{4\alpha^2}{15m^2}\, |\psi(0)|^2.$$

The wavefunction ψ(x) is nonzero at the origin only for s-wave states. For the 
2S state, the shift is 

$$\Delta E = -\frac{4\alpha^2}{15m^2} \cdot \frac{\alpha^3 m^3}{8\pi} = -\frac{\alpha^5 m}{30\pi} = -1.123 \times 10^{-7}\ \text{eV}.$$

This is a (small) part of the Lamb shift splitting listed in Table 6.1. 
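
The arithmetic behind this number is short enough to reproduce. The sketch
below assumes recent reference values for α and the electron mass (inputs
chosen here, not given in the text), ignores reduced-mass effects, and uses
$|\psi_{2S}(0)|^2 = (\alpha m)^3/8\pi$ in natural units. It should land close
to the value quoted above; small differences in the last digit depend on the
input constants.

```python
import math

ALPHA = 7.2973525693e-3          # fine-structure constant (assumed input)
ELECTRON_MASS_EV = 0.51099895e6  # electron mass in eV (assumed input)


def psi_squared_at_origin_2s(alpha, m):
    """|psi_2S(0)|^2 = (alpha m)^3 / (8 pi), natural units, no reduced mass."""
    return (alpha * m) ** 3 / (8 * math.pi)


def uehling_shift_2s(alpha=ALPHA, m=ELECTRON_MASS_EV):
    """Energy shift -(4 alpha^2 / 15 m^2) |psi(0)|^2 of the 2S level, in eV."""
    return -(4 * alpha ** 2 / (15 * m ** 2)) * psi_squared_at_origin_2s(alpha, m)


if __name__ == "__main__":
    print(f"Delta E(2S) = {uehling_shift_2s():.4e} eV")
```

The same function with `m` replaced by a heavier lepton mass illustrates how
strongly the shift depends on the mass of the orbiting particle, although a
quantitative treatment of such systems needs the full Uehling potential.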





Figure 7.7. Contour for evaluating the effective strength of the electromagnetic 
interaction in the nonrelativistic limit. The pole at Q = iμ gives the 
Coulomb potential. The branch cut gives the order-α correction due to vacuum 
polarization. 


The delta function in Eq. (7.94) is only an approximation; to find the true 
range of the correction term, we write Eq. (7.93) in the form 

$$V(\mathbf{x}) = \frac{ie^2}{(2\pi)^2 r} \int_{-\infty}^{\infty} dQ\, \frac{Q\, e^{iQr}}{Q^2 + \mu^2}\, \bigl[1 + \hat\Pi_2(-Q^2)\bigr] \qquad (Q \equiv |\mathbf{q}|),$$

where we have inserted a photon mass μ to regulate the Coulomb potential. To 
perform this integral we push the contour upward (see Fig. 7.7). The leading 
contribution comes from the pole at Q = iμ, giving the Coulomb potential, 
$-\alpha/r$. But there is an additional contribution from the branch cut, which 
begins at Q = 2mi. The real part of the integrand is the same on both sides 
of the cut, so the only contribution to the integral comes from the imaginary 
part of $\hat\Pi_2$. Defining q = −iQ, we find that the contribution from the cut is 

$$\begin{aligned}
\delta V(r) &= \frac{-e^2}{(2\pi)^2 r} \cdot 2 \int_{2m}^{\infty} dq\, \frac{e^{-qr}}{q}\, \mathrm{Im}\bigl[\hat\Pi_2(q^2 - i\epsilon)\bigr] \\
&= -\frac{\alpha}{r}\, \frac{2}{\pi} \int_{2m}^{\infty} dq\, \frac{e^{-qr}}{q}\, \frac{\alpha}{3} \sqrt{1 - \frac{4m^2}{q^2}} \left(1 + \frac{2m^2}{q^2}\right).
\end{aligned}$$


When r ≫ 1/m, this integral is dominated by the region where q ≈ 2m. 
Approximating the integrand in this region and substituting t = q − 2m, we 
find 

$$\begin{aligned}
\delta V(r) &\approx -\frac{\alpha}{r} \cdot \frac{2}{\pi} \int_0^\infty dt\, \frac{e^{-(t+2m)r}}{2m}\, \frac{\alpha}{3} \sqrt{\frac{t}{m}} \left(\frac{3}{2}\right) + \cdots \\
&= -\frac{\alpha}{r} \cdot \frac{\alpha}{4\sqrt{\pi}}\, \frac{e^{-2mr}}{(mr)^{3/2}},
\end{aligned}$$

so that 

$$V(r) = -\frac{\alpha}{r} \left(1 + \frac{\alpha}{4\sqrt{\pi}}\, \frac{e^{-2mr}}{(mr)^{3/2}} + \cdots\right). \tag{7.95}$$

Thus the range of the correction term is roughly the electron Compton wavelength, 
1/m. Since hydrogen wavefunctions are nearly constant on this scale, 
the delta function in Eq. (7.94) was a good approximation. The radiative 
correction to V(r) is called the Uehling potential. 

Figure 7.8. Virtual $e^+e^-$ pairs are effectively dipoles of length ~1/m, 
which screen the bare charge of the electron. 

We can interpret the correction as being due to screening. At r ≫ 1/m, 
virtual $e^+e^-$ pairs make the vacuum a dielectric medium in which the apparent 
charge is less than the true charge (see Fig. 7.8). At smaller distances we begin 
to penetrate the polarization cloud and see the bare charge. This phenomenon 
is known as vacuum polarization. 

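Now consider the opposite limit: small distance or $-q^2 \gg m^2$. Equation 
(7.91) then becomes 

$$\begin{aligned}
\hat\Pi_2(q^2) &\approx \frac{2\alpha}{\pi} \int_0^1 dx\, x(1-x) \left[\log\left(\frac{-q^2}{m^2}\right) + \log\bigl(x(1-x)\bigr) + O\Bigl(\frac{m^2}{q^2}\Bigr)\right] \\
&= \frac{\alpha}{3\pi} \left[\log\left(\frac{-q^2}{m^2}\right) - \frac{5}{3} + O\Bigl(\frac{m^2}{q^2}\Bigr)\right].
\end{aligned}$$

The effective coupling constant in this limit is therefore 

$$\alpha_{\text{eff}}(q^2) = \frac{\alpha}{1 - \dfrac{\alpha}{3\pi} \log\left(\dfrac{-q^2}{A m^2}\right)}, \tag{7.96}$$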

where A = exp(5/3). The effective electric charge becomes much larger 
at small distances, as we penetrate the screening cloud of virtual 
electron-positron pairs. 

Figure 7.9. Differential cross section for Bhabha scattering, $e^+e^- \to e^+e^-$, 
at $E_{\text{cm}}$ = 29 GeV, as measured by the HRS collaboration, M. Derrick, et al., 
Phys. Rev. D34, 3286 (1986). The upper curve is the order-$\alpha^2$ prediction 
derived in Problem 5.2, plus a very small (~2%) correction due to the weak 
interaction. The lower curve includes all QED radiative corrections to order 
$\alpha^3$ except the vacuum polarization contribution; note that these corrections 
depend on the experimental conditions, as explained in Chapter 6. The middle 
curve includes the vacuum polarization contribution as well, which increases 
the effective value of $\alpha^2$ by about 10% at this energy. 

The combined vacuum polarization effect of the electron and of heavier 
quarks and leptons causes the value of $\alpha_{\text{eff}}(q^2)$ to increase by about 5% from 
q = 0 to q = 30 GeV, and this effect is observed in high-energy experiments. 
Figure 7.9 shows the cross section for Bhabha scattering at $E_{\text{cm}}$ = 29 GeV, 
and a comparison to QED with and without the vacuum polarization diagram. 

We can write $\alpha_{\text{eff}}$ as a function of r by setting q = 1/r. The behavior of 
$\alpha_{\text{eff}}(r)$ for all r is sketched in Fig. 7.10. The idea of a distance-dependent (or 
“scale-dependent” or “running”) coupling constant will be a major theme of 
the rest of this book. 

Figure 7.10. A qualitative sketch of the effective electromagnetic coupling 
constant generated by the one-loop vacuum polarization diagram, as a function 
of distance. The horizontal scale covers many orders of magnitude. 


Problems 


7.1 In Section 7.3 we used an indirect method to analyze the one-loop s-channel 
diagram for boson-boson scattering in $\phi^4$ theory. To verify our indirect analysis, 
evaluate all three one-loop diagrams, using the standard method of Feynman parameters. 
Check the validity of the optical theorem. 


7.2 Alternative regulators in QED. In Section 7.5, we saw that the Ward identity 
can be violated by an improperly chosen regulator. Let us check the validity of the 
identity $Z_1 = Z_2$, to order α, for several choices of the regulator. We have already 
verified that the relation holds for Pauli-Villars regularization. 

(a) Recompute $\delta Z_1$ and $\delta Z_2$, defining the integrals (6.49) and (6.50) by simply 
placing an upper limit Λ on the integration over $\ell_E$. Show that, with this definition, 
$\delta Z_1 \neq \delta Z_2$. 

(b) Recompute $\delta Z_1$ and $\delta Z_2$, defining the integrals (6.49) and (6.50) by dimensional 
regularization. You may take the Dirac matrices to be 4 × 4 as usual, but note 
that, in d dimensions, 

$$g_{\mu\nu} g^{\mu\nu} = d.$$

Show that, with this definition, $\delta Z_1 = \delta Z_2$. 


7.3 Consider a theory of elementary fermions that couple both to QED and to a 
Yukawa field φ: 

$$H_{\text{int}} = \int d^3x\, \frac{\lambda}{\sqrt{2}}\, \phi\, \bar\psi \psi + \int d^3x\, e A_\mu \bar\psi \gamma^\mu \psi.$$

(a) Verify that the contribution to $Z_1$ from the vertex diagram with a virtual φ 
equals the contribution to $Z_2$ from the diagram with a virtual φ. Use dimensional 
regularization. Is the Ward identity generally true in this theory? 

(b) Now consider the renormalization of the $\phi \bar\psi \psi$ vertex. Show that the rescaling 
of this vertex at $q^2 = 0$ is not canceled by the correction to $Z_2$. (It suffices to 
compute the ultraviolet-divergent parts of the diagrams.) In this theory, the vertex 
and field-strength rescaling give additional shifts of the observable coupling 
constant relative to its bare value. 


Radiation of Gluon Jets 


Although we have discussed QED radiative corrections at length in the last 
two chapters, so far we have made no attempt to compute a full radiatively 
corrected cross section. The reason is of course that such calculations are quite 
lengthy. Nevertheless it would be dishonest to pretend that one understands 
radiative corrections after computing only isolated effects as we have done. 
This “final project” is an attempt to remedy this situation. The project is the 
computation of one of the simplest, but most important, radiatively corrected 
cross sections. You should finish Chapter 6 before starting this project, but 
you need not have read Chapter 7. 

Strongly interacting particles—pions, kaons, and protons—are produced 
in $e^+e^-$ annihilation when the virtual photon creates a pair of quarks. If one 
ignores the effects of the strong interactions, it is easy to calculate the total 
cross section for quark pair production. In this final project, we will analyze 
the first corrections to this formula due to the strong interactions. 

Let us represent the strong interactions by the following simple model: 
Introduce a new massless vector particle, the gluon, which couples universally 
to quarks: 

$$\Delta H = \int d^3x\, g\, \bar\psi_{fi} \gamma^\mu \psi_{fi} B_\mu.$$

Here f labels the type (“flavor”) of the quark (u, d, s, c, etc.) and i = 1, 2, 3 
labels the color. The strong coupling constant g is independent of flavor and 
color. The electromagnetic coupling of quarks depends on the flavor, since the 
u and c quarks have charge $Q_f = +2/3$ while the d and s quarks have charge 
$Q_f = -1/3$. By analogy to α, let us define 

$$\alpha_g = \frac{g^2}{4\pi}.$$

In this exercise, we will compute the radiative corrections to quark pair production 
proportional to $\alpha_g$. 

This model of the strong interactions of quarks does not quite agree with 
the currently accepted theory of the strong interactions, quantum chromodynamics 
(QCD). However, all of the results that we will derive here are also 
correct in QCD with the replacement 

$$\alpha_g \to \frac{4}{3}\alpha_s.$$

We will verify this claim in Chapter 17. 

Throughout this exercise, you may ignore the masses of quarks. You may 
also ignore the mass of the electron, and average over electron and positron 
polarizations. To control infrared divergences, it will be necessary to assume 
that the gluons have a small nonzero mass μ, which can be taken to zero 
only at the end of the calculation. However (as we discussed in Problem 5.5), 
it is consistent to sum over polarization states of this massive boson by the 
replacement: 

$$\sum \epsilon^\mu \epsilon^{\nu *} \to -g^{\mu\nu};$$

this also implies that we may use the propagator 

$$\bigl(\text{gluon propagator}\bigr)^{\mu\nu} = \frac{-ig^{\mu\nu}}{k^2 - \mu^2 + i\epsilon}.$$


(a) Recall from Section 5.1 that, to lowest order in α and neglecting the 
effects of gluons, the total cross section for production of a pair of quarks 
of flavor f is 

$$\sigma(e^+e^- \to \bar{q}q) = \frac{4\pi\alpha^2}{3s} \cdot 3Q_f^2.$$

Compute the diagram contributing to $e^+e^- \to \bar{q}q$ involving one virtual 
gluon. Reduce this expression to an integral over Feynman parameters, 
and renormalize it by subtraction at $q^2 = 0$, following the prescription 
used in Eq. (6.55). Notice that the resulting expression can be considered 
as a correction to $F_1(q^2)$ for the quark. Argue that, for massless quarks, 
to all orders in $\alpha_g$, the total cross section for production of a quark pair 
unaccompanied by gluons is 

$$\sigma(e^+e^- \to \bar{q}q) = \frac{4\pi\alpha^2}{3s} \cdot 3\,\bigl|F_1(q^2 = s)\bigr|^2,$$

with $F_1(q^2 = 0) = Q_f$. 

(b) Before we attempt to evaluate the Feynman parameter integrals in part 
(a), let us put this contribution aside and study the process $e^+e^- \to \bar{q}qg$, 
quark pair production with an additional gluon emitted. Before we 
compute the cross section, it will be useful to work out some kinematics. 
Let q be the total 4-momentum of the reaction, let $k_1$ and $k_2$ be the 
4-momenta of the final quark and antiquark, and let $k_3$ be the 4-momentum 
of the gluon. Define 

$$x_i = \frac{2k_i \cdot q}{q^2}, \qquad i = 1, 2, 3;$$

this is the ratio of the center-of-mass energy of particle i to the maximum 
available energy. Then show (i) $\sum x_i = 2$, (ii) all other Lorentz scalars 
involving only the final-state momenta can be computed in terms of the 
$x_i$ and the particle masses, and (iii) the complete integral over 3-body 
phase space can be written as 

$$\int d\Pi_3 = \prod_i \int \frac{d^3k_i}{(2\pi)^3}\, \frac{1}{2E_i}\, (2\pi)^4 \delta^{(4)}\Bigl(q - \sum_i k_i\Bigr) = \frac{q^2}{128\pi^3} \int dx_1\, dx_2.$$

Find the region of integration for $x_1$ and $x_2$ if the quark and antiquark 
are massless but the gluon has mass μ. 

(c) Draw the Feynman diagrams for the process $e^+e^- \to \bar{q}qg$, to leading 
order in α and $\alpha_g$, and compute the differential cross section. You may 
throw away the information concerning the correlation between the initial 
beam axis and the directions of the final particles. This is conveniently 
done as follows: The usual trace tricks for evaluating the square of the 
matrix element give for this process a result of the structure 

$$\int d\Pi_3\, \frac{1}{4} \sum |\mathcal{M}|^2 = L_{\mu\nu} \int d\Pi_3\, H^{\mu\nu},$$

where $L_{\mu\nu}$ represents the electron trace and $H^{\mu\nu}$ represents the quark 
trace. If we integrate over all parameters of the final state except $x_1$ and 
$x_2$, which are scalars, the only preferred 4-vector characterizing the final 
state is $q^\mu$. On the other hand, $H_{\mu\nu}$ satisfies 

$$q^\mu H_{\mu\nu} = H_{\mu\nu} q^\nu = 0.$$

Why is this true? (There is an argument based on general principles; 
however, you might find it a useful check on your calculation to verify 
this property explicitly.) Since, after integrating over final-state vectors, 
$\int H^{\mu\nu}$ depends only on $q^\mu$ and scalars, it can only have the form 

$$\int d\Pi_3\, H^{\mu\nu} = \left(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2}\right) \cdot H,$$

where H is a scalar. With this information, show that 

$$L_{\mu\nu} \int d\Pi_3\, H^{\mu\nu} = \frac{1}{3} \bigl(g^{\mu\nu} L_{\mu\nu}\bigr) \cdot \int d\Pi_3\, \bigl(g^{\rho\sigma} H_{\rho\sigma}\bigr).$$

Using this trick, derive the differential cross section 

$$\frac{d\sigma}{dx_1\, dx_2}(e^+e^- \to \bar{q}qg) = \frac{4\pi\alpha^2}{3s} \cdot 3Q_f^2 \cdot \frac{\alpha_g}{2\pi}\, \frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)}$$

in the limit μ → 0. If we assume that each original final-state particle is 
realized physically as a jet of strongly interacting particles, this formula 
gives the probability for observing three-jet events in $e^+e^-$ annihilation 
and the kinematic distribution of these events. The form of the distribution 
in the $x_i$ is an absolute prediction, and it agrees with experiment. The 
normalization of this distribution is a measure of the strong-interaction 
coupling constant. 

(d) Now replace μ ≠ 0 in the formula of part (c) for the differential cross 
section, and carefully integrate over the region found in part (b). You 
may assume $\mu^2 \ll q^2$. In this limit, you will find infrared-divergent terms 
of order $\log(q^2/\mu^2)$ and also $\log^2(q^2/\mu^2)$, finite terms of order 1, and 
terms explicitly suppressed by powers of $(\mu^2/q^2)$. You may drop terms 
of the last type throughout this calculation. For the moment, collect and 
evaluate only the infrared-divergent terms. 

(e) Now analyze the Feynman parameter integral obtained in part (a), again 
working in the limit $\mu^2 \ll q^2$. Note that this integral has singularities in 
the region of integration. These should be controlled by evaluating the 
integral for q spacelike and then analytically continuing into the physical 
region. That is, write $Q^2 = -q^2$, evaluate the integral for $Q^2 > 0$, and 
then carefully analytically continue the result to $Q^2 = -q^2 - i\epsilon$. Combine 
the result with the answer from part (d) to form the total cross section for 
$e^+e^- \to$ strongly interacting particles, to order $\alpha_g$. Show that all 
infrared-divergent logarithms cancel out of this quantity, so that this total cross 
section is well-defined in the limit μ → 0. 

(f) Finally, collect the terms of order 1 from the integrations of parts (d) and
(e) and combine them. To evaluate certain of these terms, you may find
the following formula useful:

$$\int_0^1 dx\, \frac{\log(1-x)}{x} = -\frac{\pi^2}{6}.$$


(It is not hard to prove this.) Show that the total cross section is given,
to this order in $\alpha_g$, by

$$\sigma(e^+e^- \to \bar{q}q \text{ or } \bar{q}qg) = \frac{4\pi\alpha^2}{3s} \cdot 3Q_f^2 \cdot \left[ 1 + \frac{3\alpha_g}{4\pi} \right].$$

This formula gives a second way of measuring the strong-interaction coupling
constant. The experimental results agree (within the current experimental
errors) with the results obtained by the method of part (c). We
will discuss the measurement of $\alpha_s$ more fully in Section 17.6.
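
The integral quoted in part (f) can be checked numerically before it is used. The sketch below is only a sanity check, not a proof; the function name and the number of cells are arbitrary choices. It applies the midpoint rule, which never samples the endpoints, so the integrable logarithmic singularity at x = 1 causes no trouble.

```python
import math


def integral_log_over_x(n_cells=200_000):
    """Midpoint-rule estimate of the integral of log(1 - x)/x over 0 < x < 1."""
    h = 1.0 / n_cells
    total = 0.0
    for k in range(n_cells):
        x = (k + 0.5) * h  # midpoint of cell k; never equals 0 or 1
        total += math.log(1.0 - x) / x
    return total * h


estimate = integral_log_over_x()
exact = -math.pi**2 / 6.0
```

The integrand is negative throughout the interval, which is a quick way to remember the sign of the result.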

Invitation: Ultraviolet Cutoffs 
and Critical Fluctuations 


The main purpose of Part II of this book is to develop a general theory of 
renormalization. This theory will explain the origin of ultraviolet divergences 
in field theory and will indicate when these divergences can be removed sys¬ 
tematically. It will also give a way to convert the divergences of Feynman 
diagrams from a problem into a tool. We will apply this tool to study the 
asymptotic large- or small-momentum behavior of field theory amplitudes. 

When we first encountered an ultraviolet divergence in the calculation of 
the one-loop vertex correction in Section 6.3, it seemed an aberration that 
ought to disappear before it caused us too much discomfort. In Chapter 7 we 
saw further examples of ultraviolet-divergent diagrams, enough to convince us 
that such divergences occur ubiquitously in Feynman diagram computations. 
Thus it is necessary for anyone studying field theory to develop a point of 
view toward these divergences. Most people begin with the belief that any 
theory that contains divergences must be nonsense. But this viewpoint is 
overly restrictive, since it excludes not only quantum field theory but even 
the classical electrodynamics of point particles. 

With some experience, one might adopt a more permissive attitude of 
peaceful coexistence with the divergences: One can accept a theory with di¬ 
vergences, as long as they do not appear in physical predictions. In Chapter 7 
we saw that all of the divergences that appear in the one-loop radiative cor¬ 
rections to electron scattering from a heavy target can be eliminated by con¬ 
sistently eliminating the bare values of the mass and charge of the electron in 
favor of their measured physical values. In Chapter 10, we will argue that all 
of the ultraviolet divergences of QED, in all orders of perturbation theory, can 
be eliminated in this way. Thus, as long as one is willing to consider the mass 
and charge of the electron as measured parameters, the predictions of QED 
perturbation theory will always be free of divergences. We will also show in 
Chapter 10 that QED belongs to a well-defined class of field theories in which 
all ultraviolet divergences are removed after a fixed small number of physical 
parameters are taken from experiment. These theories, called renormalizable 
quantum field theories, are the only ones in which perturbation theory gives 
well-defined predictions. 

Ideally, though, one should take the further step of trying to understand
physically why the divergences appear and why their effects are more severe
in some theories than in others. This direct approach to the divergence
problem was pioneered in the 1960s by Kenneth Wilson. The crucial insights 
needed to solve this problem emerged from a correspondence, discovered by 
Wilson and others, between quantum field theory and the statistical physics 
of magnets and fluids. Wilson’s approach to renormalization is the subject 
of Chapter 12. The present chapter gives a brief introduction to the issues 
in condensed matter physics that have provided insight into the problem of 
ultraviolet divergences. 


Formal and Physical Cutoffs 

Ultraviolet divergences signal that quantities calculated in a quantum field 
theory depend on some very large momentum scale, the ultraviolet cutoff. 
Equivalently, in position space, divergent quantities depend on some very 
small distance scale. 

The idea of a small-distance cutoff in the continuum description of a sys¬ 
tem occurs in classical field theories as well. Typically the cutoff is at the 
scale of atomic distances, where the continuum description no longer applies. 
However, the size of the cutoff manifests itself in certain parameters of the 
continuum theory. In fluid dynamics, for instance, parameters such as the 
viscosity and the speed of sound are of just the size one would expect by com¬ 
bining typical atomic radii and velocities. Similarly, in a magnet, the magnetic 
susceptibility can be estimated by assuming that the energy cost of flipping 
an electron spin is on the order of a tenth of an eV, as we would expect from 
atomic physics. Each of these systems possesses a natural ultraviolet cutoff 
at the scale of an atom; by understanding the physics at the atomic scale, we 
can compute the parameters that determine the physics on larger scales. 

In quantum field theory, however, we have no precise knowledge of the 
fundamental physics at very short distance scales. Thus, we can only measure 
parameters such as the physical charge and mass of the electron, not compute 
them from first principles. The presence of ultraviolet divergences in the rela¬ 
tions between these physical parameters and their bare values is a sign that 
these parameters are controlled by the unknown short-distance physics. 

Whether we know the fundamental physics at small distance scales or 
not, we need two kinds of information in order to write an effective theory for 
large-distance phenomena. First, we must know how many parameters from 
the small distance scale are relevant to large-distance physics. Second, and 
more importantly, we must know what degrees of freedom from the underlying 
theory appear at large distances. 

In fluid mechanics, it is something of a miracle, from the atomic point of 
view, that any large-distance degrees of freedom even exist. Nevertheless, the 
equations that express the transport of energy and mass over large distances 
do have smooth, coherent solutions. The large-distance degrees of freedom are
the flows that transport these conserved quantities, and sound waves of long
wavelength. 

In quantum field theory, the large-distance physics involves only those 
particles that have masses that are very small compared to the fundamental 
cutoff scale. These particles and their dynamics are the quantum analogues of 
the large-scale flows in fluid mechanics. The simplest way to naturally arrange 
for such particles to appear is to make use of particles that naturally have zero 
mass. So far in this book, we have encountered two types of particles whose 
mass is precisely zero, the photon and the chiral fermion. (In Chapter 11 
we will meet one further naturally massless particle, the Goldstone boson.) 
We might argue that QED exists as a theory on scales much larger than 
its cutoff because the photon is naturally massless and because the left- and 
right-handed electrons are very close to being chiral fermions. 

There is another way that particles of zero or almost zero mass can arise 
in quantum field theory: We can simply tune the parameters of a scalar field 
theory so that the scalar particles have masses small compared to the cut¬ 
off. This method of introducing particles with small mass seems arbitrary 
and unnatural. Nevertheless, it has an analogue in statistical mechanics that 
is genuinely interesting in that discipline and can teach us some important 
lessons. 

Normally, in a condensed matter system, the thermal fluctuations are 
correlated only over atomic distances. Under special circumstances, however, 
they can have much longer range. The clearest example of this phenomenon 
occurs in a ferromagnet. At high temperature, the electron spins in a magnet 
are disorganized and fluctuating; but at low temperature, these spins align to 
a fixed direction.* Let us think about how this alignment builds up as the 
temperature of the magnet is lowered. As the magnet cools from high tem¬ 
perature, clusters of correlated spins become larger and larger. At a certain 
point—the temperature of magnetization—the entire sample becomes a sin¬ 
gle large cluster with a well-defined macroscopic orientation. Just above this 
temperature, the magnet contains large clusters of spins with a common orien¬ 
tation, which in turn belong to still larger clusters, such that the orientations 
on the very largest scale are still randomized through the sample. This situ¬ 
ation is illustrated in Fig. 8.1. Similar behavior occurs in the vicinity of any 
other second-order phase transition, for example, the order-disorder transi¬ 
tion in binary alloys, the critical point in fluids, or the superfluid transition 
in Helium-4. 

The natural description of these very long wavelength fluctuations is in 
terms of a fluctuating continuum field. At the lowest intuitive level, we might 


*In a real ferromagnet, the long-range magnetic dipole-dipole interaction causes 
the state of uniform magnetization to break up into an array of magnetic domains. 
In this book, we will ignore this interaction and think of a magnetic spin as a pure 
orientation. It is this idealized system that is directly analogous to a quantum field 
theory.

Figure 8.1. Clusters of oriented spins near the critical point of a ferromagnet.

substitute quantum for statistical fluctuations and try to describe this sys¬ 
tem as a quantum field theory. In Section 9.3 we will derive a somewhat more 
subtle relation that makes a precise connection between the statistical and 
the quantum systems. Through this connection, the behavior of any statis¬ 
tical system near a second-order phase transition can be translated into the 
behavior of a particular quantum field theory. This quantum field theory has 
a field with a mass that is very small compared to the basic atomic scale and 
that goes to zero precisely at the phase transition. 

But this connection seems to compound the problem of ultraviolet diver¬ 
gences in quantum field theory: If the wealth of phase transitions observed in 
Nature generates a similar wealth of quantum field theories, how can we pos¬ 
sibly define a quantum field theory without detailed reference to its origins in 
physics at the scale of its ultraviolet cutoff? Saying that a quantum field the¬ 
ory makes predictions independent of the cutoff would be equivalent to saying 
that the statistical fluctuations in the neighborhood of a critical point are in¬ 
dependent of whether the system is a magnet, a fluid, or an alloy. But is this 
statement so obviously incorrect? By reversing the logic, we would find that 
quantum field theory makes a remarkably powerful prediction for condensed 
matter systems, a prediction of universality for the statistical fluctuations 
near a critical point. In fact, this prediction is verified experimentally. 

A major theme of Part II of this book will be that these two ideas—cutoff 
independence in quantum field theory and universality in the theory of critical 
phenomena—are naturally the same idea, and that understanding either of 
these ideas gives insight into the other.


Landau Theory of Phase Transitions


To obtain a first notion of what could be universal in the phenomena of phase 
transitions, let us examine the simplest continuum theory of second-order 
phase transitions, due to Landau. 

First we should review a little thermodynamics and clarify our nomen¬ 
clature. In thermodynamics, a first-order phase transition is a point across 
which some thermodynamic variable (the density of a fluid, or the magneti¬ 
zation of a ferromagnet) changes discontinuously. At a phase transition point, 
two quite distinct thermodynamic states (liquid and gas, or magnetization 
parallel and antiparallel to a given axis) are in equilibrium. The thermody¬ 
namic quantity that changes discontinuously across the transition, and that 
characterizes the difference of the two competing phases, is called the order 
parameter. In most circumstances, it is possible to change a second thermo¬ 
dynamic parameter in such a way that the two competing states move closer 
together in the thermodynamic space, so that at some value of this parameter, 
these two states become identical and the discontinuity in the order parame¬ 
ter disappears. This endpoint of the line of first-order transitions is called a 
second-order phase transition, or, more properly, a critical point. Viewed from 
the other direction, a critical point is a point at which a single thermodynamic 
state bifurcates into two macroscopically distinct states. It is this bifurcation 
that leads to the long-ranged thermal fluctuations discussed in the previous 
section. 

A concrete example of this behavior is exhibited by a ferromagnet. Let us 
assume for simplicity that the material we are discussing has a preferred axis 
of magnetization, so that at low temperature, the system will have its spins 
ordered either parallel or antiparallel to this axis. The total magnetization 
along this axis, M, is the order parameter. At low temperature, application
of an external magnetic field H will favor one or the other of the two possible
states. At H = 0, the two states will be in equilibrium; if H is changed from
a small negative to a small positive value, the thermodynamic state and the
value of M will change discontinuously. Thus, for any fixed (low) temperature,
there is a first-order transition at H = 0. Now consider the effect of raising
the temperature: The fluctuation of the spins increases and the value of |M|
decreases. At some temperature $T_C$ the system ceases to be magnetized at
H = 0. At this point, the first-order phase transition disappears and the
two competing thermodynamic states coalesce. The system thus has a critical
point at $T = T_C$. The location of these various transitions in the H-T plane
is shown in Fig. 8.2. 

Landau described this behavior by the use of the Gibbs free energy G;
this is the thermodynamic potential that depends on M and T, such that

$$\left. \frac{\partial G}{\partial M} \right|_T = H. \qquad (8.1)$$


Figure 8.2. Phase diagram in the H-T plane for a uniaxial ferromagnet.

He suggested that we concentrate our attention on the region of the critical
point: $T \approx T_C$, $M \approx 0$. Then it is reasonable to expand G(M) as a Taylor
series in M. For H = 0, we can write

$$G(M) = A(T) + B(T) M^2 + C(T) M^4 + \cdots. \qquad (8.2)$$

Because the system has a symmetry under $M \to -M$, G(M) can contain only
even powers of M. Since M is small, we will ignore the higher terms in the
expansion. Given Eq. (8.2), we can find the possible values of M at H = 0 by
solving

$$0 = \frac{\partial G}{\partial M} = 2B(T) M + 4C(T) M^3. \qquad (8.3)$$

If B and C are positive, the only solution is M = 0. However, if C > 0 but
B is negative below some temperature $T_C$, we have a nontrivial solution for
$T < T_C$, as shown in Fig. 8.3. More concretely, approximate for $T \approx T_C$:

$$B(T) = b(T - T_C), \qquad C(T) = c. \qquad (8.4)$$

Then the solution to Eq. (8.3) is

$$M = \begin{cases} 0 & \text{for } T > T_C; \\ \pm\left[ (b/2c)(T_C - T) \right]^{1/2} & \text{for } T < T_C. \end{cases} \qquad (8.5)$$

This is just the qualitative behavior that we expect at a critical point. 
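
The solution (8.5) can be cross-checked numerically. The following Python sketch compares a brute-force minimization of the quartic free energy (8.2), using the approximation (8.4), with the closed form. The parameter values b = c = 1 and T_C = 1, the grid size, and the function names are illustrative assumptions, not values from the text.

```python
import math


def landau_G(M, T, Tc=1.0, b=1.0, c=1.0, A=0.0):
    """Eq. (8.2) with B(T) = b (T - Tc) and C(T) = c, as in Eq. (8.4)."""
    return A + b * (T - Tc) * M**2 + c * M**4


def landau_magnetization(T, Tc=1.0, b=1.0, c=1.0):
    """Positive branch of Eq. (8.5)."""
    if T >= Tc:
        return 0.0
    return math.sqrt(b * (Tc - T) / (2.0 * c))


def grid_minimum(T, m_max=2.0, n=40001, **params):
    """Locate the minimum of G over 0 <= M <= m_max on a uniform grid."""
    best_M, best_G = 0.0, landau_G(0.0, T, **params)
    for i in range(1, n):
        M = m_max * i / (n - 1)
        g = landau_G(M, T, **params)
        if g < best_G:
            best_M, best_G = M, g
    return best_M
```

Because M grows as the square root of (T_C - T), reducing the distance to the critical temperature by a factor of four halves the magnetization. This is the power law that Landau theory predicts independently of b and c.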

To find the value of M at nonzero external field, we could solve Eq. (8.1) 
with the left-hand side given by (8.2). An equivalent procedure is to minimize 
a new function, related to (8.2). Define 

$$G(M, H) = A(T) + B(T) M^2 + C(T) M^4 - HM. \qquad (8.6)$$

Then the minimum of G(M, H) with respect to M at fixed H gives the value
of M that satisfies Eq. (8.1). The minimum is unique except when H = 0 and
$T < T_C$, where we find the double minimum in the second line of (8.5). This
is consistent with the phase diagram shown in Fig. 8.2. 
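
The first-order jump across H = 0 can be seen in a small numerical experiment. The sketch below minimizes Eq. (8.6) on a grid of M values for a small positive and a small negative field. The parameter choices (b = c = 1, T_C = 1, A = 0), the grid, and the function names are assumptions made for illustration only.

```python
def gibbs_with_field(M, H, T, Tc=1.0, b=1.0, c=1.0):
    """Eq. (8.6) with A(T) set to zero and B, C taken from Eq. (8.4)."""
    return b * (T - Tc) * M**2 + c * M**4 - H * M


def equilibrium_M(H, T, m_max=2.0, n=8001, **params):
    """Global minimum of G(M, H) over a uniform grid on [-m_max, m_max]."""
    best_M, best_G = None, None
    for i in range(n):
        M = -m_max + 2.0 * m_max * i / (n - 1)
        g = gibbs_with_field(M, H, T, **params)
        if best_G is None or g < best_G:
            best_M, best_G = M, g
    return best_M


# Below Tc the magnetization flips sign discontinuously with the field;
# above Tc it passes smoothly through zero.
m_plus_cold = equilibrium_M(+0.01, 0.5)
m_minus_cold = equilibrium_M(-0.01, 0.5)
m_plus_hot = equilibrium_M(+0.01, 1.5)
```

With these values the low-temperature minimum sits near M = +0.5 or M = -0.5 depending on the sign of H, while above T_C the response is small and proportional to H.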

Figure 8.3. Behavior of the Gibbs free energy G(M) in Landau theory, at 
temperatures above and below the critical temperature. 

To study correlations in the vicinity of the phase transition, Landau
generalized this description further by considering the magnetization M to be
the integral of a local spin density:

$$M = \int d^3x\, s(\mathbf{x}). \qquad (8.7)$$

Then the Gibbs free energy (8.6) becomes the integral of a local function of
s(x),

$$G = \int d^3x \left[ \tfrac{1}{2} (\nabla s)^2 + b(T - T_C) s^2 + c s^4 - H s \right], \qquad (8.8)$$

which must be minimized with respect to the field configuration s(x). The 
first term is the simplest possible way to introduce the tendency of nearby 
spins to align with one another. We have rescaled s(x) so that the coefficient 
of this term is set to 1/2. In writing this free energy integral, we could even 
consider H to vary as a function of position. In fact, it is useful to do that; we 
can turn on H(x) near x = 0 and see what response we find at another point. 

The minimum of the free energy expression (8.8) with respect to s(x) is 
given by the solution to the variational equation 

$$0 = \delta G[s(\mathbf{x})] = -\nabla^2 s + 2b(T - T_C) s + 4c s^3 - H(\mathbf{x}). \qquad (8.9)$$

For $T > T_C$, where the macroscopic magnetization vanishes and so s(x) should
be small, we can find the qualitative behavior by ignoring the $s^3$ term. Then
s(x) obeys a linear equation,

$$\left( -\nabla^2 + 2b(T - T_C) \right) s(\mathbf{x}) = H(\mathbf{x}). \qquad (8.10)$$

To study correlations of spins, we will set

$$H(\mathbf{x}) = H_0\, \delta^{(3)}(\mathbf{x}). \qquad (8.11)$$

The resulting configuration s(x) is then the Green’s function of the differential
operator in Eq. (8.10), so we call it D(x):

$$\left( -\nabla^2 + 2b(T - T_C) \right) D(\mathbf{x}) = H_0\, \delta^{(3)}(\mathbf{x}). \qquad (8.12)$$

This Green’s function tells us the response at x when the spin at x = 0 is
forced into alignment with H. In Sections 9.2 and 9.3 we will see that D(x) is
also proportional to the zero-field spin-spin correlation function in the thermal
ensemble,

$$D(\mathbf{x}) \propto \langle s(\mathbf{x}) s(0) \rangle = \sum_{\text{all } s(\mathbf{x})} s(\mathbf{x}) s(0)\, e^{-\mathbf{H}/kT}, \qquad (8.13)$$

where $\mathbf{H}$ is the Hamiltonian of the magnetic system.

The solution to Eq. (8.12) can be found by Fourier transformation:

$$D(\mathbf{x}) = \int \frac{d^3k}{(2\pi)^3}\, \frac{H_0\, e^{i\mathbf{k}\cdot\mathbf{x}}}{|\mathbf{k}|^2 + 2b(T - T_C)}. \qquad (8.14)$$

This is just the integral we encountered in our discussion of the Yukawa
potential, Eq. (4.126). Evaluating it in the same way, we find

$$D(\mathbf{x}) = \frac{H_0}{4\pi}\, \frac{1}{r}\, e^{-r/\xi}, \qquad (8.15)$$

where

$$\xi = \left[ 2b(T - T_C) \right]^{-1/2} \qquad (8.16)$$

is the correlation length, the range of correlated spin fluctuations. Notice that
this length diverges as $T \to T_C$.
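
As a quick consistency check of Eqs. (8.15) and (8.16), the sketch below verifies by finite differences that the Yukawa form is annihilated by the operator of Eq. (8.12) away from the origin, using the radial identity that the Laplacian of D equals (1/r) times the second derivative of rD. The function names, the step size h, and the values b = 1, T_C = 1, H_0 = 1 are illustrative assumptions.

```python
import math


def correlation_length(T, Tc=1.0, b=1.0):
    """Eq. (8.16); only meaningful for T > Tc."""
    return 1.0 / math.sqrt(2.0 * b * (T - Tc))


def yukawa_D(r, xi, H0=1.0):
    """Eq. (8.15): D = (H0 / 4 pi) exp(-r / xi) / r."""
    return H0 * math.exp(-r / xi) / (4.0 * math.pi * r)


def residual(r, xi, h=1e-3):
    """(-Laplacian + xi^-2) D at radius r > 0, via a central difference."""

    def u(rr):
        return rr * yukawa_D(rr, xi)

    d2u = (u(r + h) - 2.0 * u(r) + u(r - h)) / h**2
    return -d2u / r + yukawa_D(r, xi) / xi**2


xi_example = correlation_length(1.5)  # equals 1 for b = 1, Tc = 1
```

Moving T a hundred times closer to T_C lengthens the correlation length by a factor of ten, which is the inverse square-root divergence stated in Eq. (8.16).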

The main results of this analysis, Eqs. (8.5) and (8.16), involve unknown
constants b, c that depend on physics at the atomic scale. On the other hand,
the power-law dependence in these formulae on $(T - T_C)$ follows simply from
the structure of the Landau equations and is independent of any details of
the microscopic physics. In fact, our derivation of this dependence did not
even use the fact that G describes a ferromagnet; we assumed only that G
can be expanded in powers of an order parameter and that G respects the
reflection symmetry $M \to -M$. These assumptions apply equally well to
many other types of systems: binary alloys, superfluids, and even (though the
reflection symmetry is less obvious here) the liquid-gas transition. Landau
theory predicts that, near the critical point, these systems show a universal
behavior in the dependence of M, $\xi$, and other thermodynamic quantities on
$(T - T_C)$.


Critical Exponents 

The preceding treatment of the Landau theory of phase transitions emphasizes 
its similarity to classical field theory. We set up an appropriate free energy and 
found the thermodynamically preferred configuration by solving a classical 
variational equation. This gives only an approximation to the full statistical 
problem, analogous to the approximation of replacing quantum by classical
dynamics in field theory. In Chapter 13, we will use methods of quantum field
theory to account properly for the fluctuations about the preferred Landau 
thermodynamic state. These modifications turn out to be profound, and rather 
counterintuitive. 

To describe the form of these modifications, let us write Eq. (8.15) more
generally as

$$\langle s(\mathbf{x}) s(0) \rangle = A\, \frac{1}{r^{1+\eta}}\, f(r/\xi), \qquad (8.17)$$

where A is a constant and f(y) is a function that satisfies f(0) = 1 and
$f(y) \to 0$ as $y \to \infty$. Landau theory predicts that $\eta = 0$ and f(y) is a
simple exponential. This expression has a form strongly analogous to that of a
Green’s function in quantum field theory. The constant A can be absorbed into
the field-strength renormalization of the field s(x). The correlation length $\xi$ is,
in general, a complicated function of the atomic parameters, but in the
continuum description we can simply trade these parameters for $\xi$. It is appropriate
to consider $\xi$ as a cutoff-independent, physical parameter, since it controls the
large-distance behavior of a physical correlation. In fact, the analogy between
Eq. (8.15) and the Yukawa potential suggests that we should identify $\xi^{-1}$ with
the physical mass in the associated quantum field theory. Then Eq. (8.17) gives
a cutoff-independent, continuum representation of the statistical system.

If we were working in quantum field theory, we would derive corrections 
to Eq. (8.17) as a perturbation series in the parameter c multiplying the 
nonlinear term in (8.9). This would generalize the Landau result to 

$$\langle s(\mathbf{x}) s(0) \rangle = \frac{1}{r}\, F(r/\xi, c). \qquad (8.18)$$

The perturbative corrections would depend on the properties of the continuum
field theory. For example, F(y, c) would depend on the number of components
of the field s(x), and its series expansion would differ depending on
whether the magnetization formed along a preferred axis, in a preferred plane, 
or isotropically. For order parameters with many components, the expansion 
would also depend on higher discrete symmetries of the problem. However, we 
expect that systems described by the same Landau free energy (for example, 
a single-axis ferromagnet and a liquid-gas system) should have the same per¬ 
turbation expansion when this expansion is written in terms of the physical 
mass and coupling. The complete universality of Landau theory then becomes 
a more limited concept, in which systems have the same large-distance cor¬ 
relations if their order parameters have the same symmetry. We might say 
that statistical systems divide into distinct universality classes, each with a
characteristic large-scale behavior. 

If this were the true behavior of systems near second-order phase transi¬ 
tions, it would already be a wonderful confirmation of the ideas required to 
formulate cutoff-independent quantum field theories. However, the true be¬ 
havior of statistical systems is still another level more subtle. What one finds 
experimentally is a dependence of the form of Eq. (8.17), where the function
f(y) is the same within each universality class. There is no need for an
auxiliary parameter c. On the other hand, the exponent $\eta$ takes a specific nonzero
value in each universality class. Other power-law relations of Landau theory
are also modified, in a specific manner for each universality class. For example,
Eq. (8.5) is changed, for $T < T_C$, to

$$M \propto (T_C - T)^\beta, \qquad (8.19)$$

where the exponent $\beta$ takes a fixed value for all systems in a given universality
class. For three-dimensional single-axis magnets and for fluids, $\beta = 0.313$. The
powers in these nontrivial scaling relations are called critical exponents.

The modification from Eq. (8.18) to Eq. (8.17) does not imperil the idea 
that a condensed matter system, in the vicinity of a second-order phase tran¬ 
sition, has a well-defined, cutoff-independent, continuum behavior. However, 
we would like to understand why Eq. (8.17) should be expected as the cor¬ 
rect representation. The answer to this question will come from a thorough 
analysis of the ultraviolet divergences of the corresponding quantum field the¬ 
ory. In Chapter 12, when we finally conclude our explication of the ultraviolet 
divergences, we will find that we have in hand the tools not only to justify 
Eq. (8.17), but also to calculate the values of the critical exponents using 
Feynman diagrams. In this way, we will uncover a beautiful application of 
quantum field theory to the domain of atomic physics. The success of this ap¬ 
plication will guide us, in Part III, to even more powerful tools, which we will 
need in the relativistic domain of elementary particles. 



Chapter 9 


Functional Methods 


Feynman once said that* “every theoretical physicist who is any good knows 
six or seven different theoretical representations for exactly the same physics.” 
Following his advice, we introduce in this chapter an alternative method of de¬ 
riving the Feynman rules for an interacting quantum field theory: the method 
of functional integration. 

Aside from Feynman’s general principle, we have several specific reasons 
for introducing this formalism. It will provide us with a relatively easy deriva¬ 
tion of our expression for the photon propagator, completing the proof of the 
Feynman rules for QED given in Section 4.8. The functional method gener¬ 
alizes more readily to other interacting theories, such as scalar QED (Prob¬ 
lem 9.1), and especially the non-Abelian gauge theories (Part III). Since it 
uses the Lagrangian, rather than the Hamiltonian, as its fundamental quan¬ 
tity, the functional formalism explicitly preserves all symmetries of a theory. 
Finally, the functional approach reveals the close analogy between quantum 
field theory and statistical mechanics. Exploiting this analogy, we will turn 
Feynman’s advice upside down and apply the same theoretical representation 
to two completely different areas of physics. 

9.1 Path Integrals in Quantum Mechanics 

We begin by applying the functional integral (or path integral) method to 
the simplest imaginable system: a nonrelativistic quantum-mechanical particle 
moving in one dimension. The Hamiltonian for this system is 

$$H = \frac{p^2}{2m} + V(x).$$

Suppose that we wish to compute the amplitude for this particle to travel
from one point ($x_a$) to another ($x_b$) in a given time (T). We will call this
amplitude $U(x_a, x_b; T)$; it is the position representation of the Schrödinger
time-evolution operator. In the canonical Hamiltonian formalism, U is given
by

$$U(x_a, x_b; T) = \langle x_b |\, e^{-iHT/\hbar}\, | x_a \rangle. \qquad (9.1)$$


*The Character of Physical Law (MIT Press, 1965), p. 168.

(For the next few pages we will display all factors of $\hbar$ explicitly.)

In the path-integral formalism, U is given by a very different-looking 
expression. We will first try to motivate that expression, then prove that it is 
equivalent to (9.1). 

Recall that in quantum mechanics there is a superposition principle: When 
a process can take place in more than one way, its total amplitude is the 
coherent sum of the amplitudes for each way. A simple but nontrivial example 
is the famous double-slit experiment, shown in Fig. 9.1. The total amplitude 
for an electron to arrive at the detector is the sum of the amplitudes for 
the two paths shown. Since the paths differ in length, these two amplitudes 
generally differ, causing interference. 

For a general system, we might therefore write the total amplitude for
traveling from $x_a$ to $x_b$ as

$$U(x_a, x_b; T) = \sum_{\text{all paths}} e^{i \cdot (\text{phase})} = \int \mathcal{D}x(t)\, e^{i \cdot (\text{phase})}. \qquad (9.2)$$

To be democratic, we have written the amplitude for each particular path as
a pure phase, so that no path is inherently more important than any other.
The symbol $\int \mathcal{D}x(t)$ is simply another way of writing “sum over all paths”;
since there is one path for every function x(t) that begins at $x_a$ and ends at
$x_b$, the sum is actually an integral over this continuous space of functions.

We can define this integral as part of a natural generalization of the 
calculus to spaces of functions. A function that maps functions to numbers is 
called a functional. The integrand in (9.2) is a functional, since it associates 
a complex amplitude with any function x(t). The argument of a functional 
F[x(t)] is conventionally written in square brackets rather than parentheses. 
Just as an ordinary function y(x) can be integrated over a set of points x, a
functional F[x(t)] can be integrated over a set of functions x(t); the measure
of such a functional integral is conventionally written with a script capital $\mathcal{D}$,
as in (9.2). A functional can also be differentiated with respect to its argument
(a function), and this functional derivative is denoted by $\delta F/\delta x(t)$. We will
develop more precise definitions of this new integral and derivative in the 
course of this section and the next. 

What should we use for the “phase” in Eq. (9.2)? In the classical limit, 
we should find that only one path, the classical path, contributes to the to¬ 
tal amplitude. We might therefore hope to evaluate the integral in (9.2) by 
the method of stationary phase, identifying the classical path $x_{\rm cl}(t)$ by the
stationary condition,

$$\frac{\delta}{\delta x(t)} \Big( \text{phase}[x(t)] \Big) \Big|_{x_{\rm cl}} = 0.$$

But the classical path is the one that satisfies the principle of least action,

$$\frac{\delta}{\delta x(t)} \Big( S[x(t)] \Big) \Big|_{x_{\rm cl}} = 0,$$

where $S = \int L\, dt$ is the classical action. It is tempting, therefore, to identify
the phase with S, up to a constant. Since the stationary-phase approximation
should be valid in the classical limit, that is, when $S \gg \hbar$, we will use $S/\hbar$
for the phase. Our final formula for the propagation amplitude is thus

$$\langle x_b |\, e^{-iHT/\hbar}\, | x_a \rangle = U(x_a, x_b; T) = \int \mathcal{D}x(t)\, e^{iS[x(t)]/\hbar}. \qquad (9.3)$$

Figure 9.1. The double-slit experiment. Path 2 is longer than path 1 by an
amount d, and therefore has a phase that is larger by $2\pi d/\lambda$, where $\lambda = 2\pi\hbar/p$
is the particle’s de Broglie wavelength. Constructive interference occurs when
$d = 0, \lambda, \ldots$, while destructive interference occurs when $d = \lambda/2, 3\lambda/2, \ldots$.

We can easily verify that this formula gives the correct interference pattern
in the double-slit experiment. The action for either path shown in Fig. 9.1 is
just $(1/2)mv^2 t$, the kinetic energy times the time. For path 1 the velocity is
$v_1 = D/t$, so the phase is $mD^2/2\hbar t$. For path 2 we have $v_2 = (D+d)/t$, so the
phase is $m(D+d)^2/2\hbar t$. We must assume that $d \ll D$, so that $v_1 \approx v_2$ (i.e.,
the electrons have a well-defined velocity). The excess phase for path 2 is then
$mDd/\hbar t \approx pd/\hbar$, where p is the momentum. This is exactly what we would
expect from the de Broglie relation $p = h/\lambda$, so we must be doing something
right.
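
The double-slit estimate above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the electron mass, path length D, path difference d, and flight time t used below are sample values chosen so that d is much smaller than D, not numbers from the text.

```python
HBAR = 1.054571817e-34  # reduced Planck constant in J s


def free_phase(mass, length, time):
    """S / hbar for a straight free path, with S = (1/2) m v^2 t and v = length / time."""
    return mass * length**2 / (2.0 * HBAR * time)


def excess_phase(mass, D, d, time):
    """Phase of path 2 (length D + d) minus phase of path 1 (length D)."""
    return free_phase(mass, D + d, time) - free_phase(mass, D, time)


# Sample values (assumptions): an electron covering 1 m in 1 microsecond.
m_e = 9.1093837015e-31  # electron mass in kg
D, d, t = 1.0, 1.0e-6, 1.0e-6

p = m_e * D / t                    # momentum on path 1
excess = excess_phase(m_e, D, d, t)
expected = p * d / HBAR            # de Broglie estimate p d / hbar
```

The two numbers differ only by the relative correction d/2D, which is the term dropped when the text approximates the excess phase by pd/ħ.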

To evaluate the functional integral more generally, we must define the
symbol $\int \mathcal{D}x(t)$ in the case where the number of paths x(t) is more than two
(and, in fact, continuously infinite). We will use a brute-force definition, by
discretization. Break up the time interval from 0 to T into many small pieces
of duration $\epsilon$, as shown in Fig. 9.2. Approximate a path x(t) as a sequence of
straight lines, one in each time slice. The action for this discretized path is

$$S = \int_0^T dt \left( \frac{m}{2} \dot{x}^2 - V(x) \right) \;\to\; \sum_k \left[ \frac{m}{2} \frac{(x_{k+1} - x_k)^2}{\epsilon} - \epsilon\, V\!\left( \frac{x_{k+1} + x_k}{2} \right) \right].$$

Figure 9.2. We define the path integral by dividing the time interval into
small slices of duration $\epsilon$, then integrating over the coordinate $x_k$ of each
slice.

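We then define the path integral by

$$\int \mathcal{D}x(t) \equiv \frac{1}{C(\epsilon)} \int \frac{dx_1}{C(\epsilon)} \int \frac{dx_2}{C(\epsilon)} \cdots \int \frac{dx_{N-1}}{C(\epsilon)} = \frac{1}{C(\epsilon)} \prod_k \int_{-\infty}^{\infty} \frac{dx_k}{C(\epsilon)}, \tag{9.4}$$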


where $C(\epsilon)$ is a constant, to be determined later. (We have included one factor
of $C(\epsilon)$ for each of the $N$ time slices, for reasons that will be clear below.) At
the end of the calculation we take the limit $\epsilon \to 0$. (As in Sections 4.5 and 6.2,
the $\prod$ symbol is an instruction to write what follows once for each $k$.)

Using (9.4) as the definition of the right-hand side of (9.3), we will now 
demonstrate the validity of (9.3) for a general one-particle potential problem. 
To do this, we will show that the left- and right-hand sides of (9.3) are obtained 
by integrating the same differential equation, with the same initial condition. 
In the process, we will determine the constant $C(\epsilon)$.

To derive the differential equation satisfied by (9.4), consider the addition 
of the very last time slice in Fig. 9.2. According to (9.3) and the definition 
(9.4), we should have 


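$$U(x_a, x_b; T) = \int_{-\infty}^{\infty} \frac{dx'}{C(\epsilon)} \exp\left[ \frac{i}{\hbar} \frac{m (x_b - x')^2}{2\epsilon} - \frac{i}{\hbar} \epsilon V\!\left( \frac{x_b + x'}{2} \right) \right] U(x_a, x'; T - \epsilon).$$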


The integral over $x'$ is just the contribution to $\int \mathcal{D}x$ from the last time
slice, while the exponential factor is the contribution to $e^{iS/\hbar}$ from that
slice. All contributions from previous slices are contained in
$U(x_a, x'; T - \epsilon)$.

As we send $\epsilon \to 0$, the rapid oscillation of the first term in the exponential
constrains $x'$ to be very close to $x_b$. We can therefore expand the above
expression in powers of $(x' - x_b)$:

$$U(x_a, x_b; T) = \int_{-\infty}^{\infty} \frac{dx'}{C} \exp\left( \frac{i}{\hbar} \frac{m}{2\epsilon} (x_b - x')^2 \right) \left[ 1 - \frac{i\epsilon}{\hbar} V(x_b) + \cdots \right] \times \left[ 1 + (x' - x_b) \frac{\partial}{\partial x_b} + \frac{1}{2} (x' - x_b)^2 \frac{\partial^2}{\partial x_b^2} + \cdots \right] U(x_a, x_b; T - \epsilon). \tag{9.5}$$

We can now perform the $x'$ integral by treating the exponential factor as a
Gaussian. (Properly, we should introduce a small real term in the exponent for
convergence; we will ignore this term until the next section, when we derive
Feynman rules using functional methods.) Recall the Gaussian integration
formulae

$$\int d\xi\, e^{-b\xi^2} = \sqrt{\frac{\pi}{b}}, \qquad \int d\xi\, \xi\, e^{-b\xi^2} = 0, \qquad \int d\xi\, \xi^2 e^{-b\xi^2} = \frac{1}{2b} \sqrt{\frac{\pi}{b}}.$$

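Applying these identities to (9.5), we find

$$U(x_a, x_b; T) = \left( \frac{1}{C} \sqrt{\frac{2\pi\hbar\epsilon}{-im}} \right) \left[ 1 - \frac{i\epsilon}{\hbar} V(x_b) + \frac{i\epsilon\hbar}{2m} \frac{\partial^2}{\partial x_b^2} + \mathcal{O}(\epsilon^2) \right] U(x_a, x_b; T - \epsilon).$$

This expression makes no sense in the limit $\epsilon \to 0$ unless the factor in
parentheses is equal to 1. We can therefore identify the correct definition of $C$:

$$C(\epsilon) = \sqrt{\frac{2\pi\hbar\epsilon}{-im}}. \tag{9.6}$$

Given this definition, we can compare terms of order $\epsilon$ and multiply by $i\hbar$ to
obtain

$$i\hbar \frac{\partial}{\partial T} U(x_a, x_b; T) = \left[ -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x_b^2} + V(x_b) \right] U(x_a, x_b; T) = H\, U(x_a, x_b; T). \tag{9.7}$$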


This is the Schrödinger equation. But it is easy to show that the time-evolution
operator $U$, as originally defined in (9.1), satisfies the same equation.
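
For the special case $V = 0$, this statement can be checked numerically. The sketch below takes the standard closed-form free-particle amplitude, $U = \sqrt{m/2\pi i\hbar T}\,\exp[im x^2/2\hbar T]$ (with $x = x_b - x_a$), and verifies by finite differences that it satisfies the free Schrödinger equation. The function names, the step size, and the sample points are illustrative assumptions.

```python
import cmath


def free_kernel(x, T, m=1.0, hbar=1.0):
    """Free-particle amplitude sqrt(m / (2 pi i hbar T)) * exp(i m x^2 / (2 hbar T))."""
    prefactor = cmath.sqrt(m / (2j * cmath.pi * hbar * T))
    return prefactor * cmath.exp(1j * m * x * x / (2.0 * hbar * T))


def schrodinger_residual(x, T, m=1.0, hbar=1.0, h=1e-4):
    """Relative mismatch between i*hbar*dU/dT and -(hbar^2/2m) d^2U/dx^2,
    both estimated with central finite differences of step h."""
    dU_dT = (free_kernel(x, T + h, m, hbar) - free_kernel(x, T - h, m, hbar)) / (2.0 * h)
    d2U_dx2 = (free_kernel(x + h, T, m, hbar)
               - 2.0 * free_kernel(x, T, m, hbar)
               + free_kernel(x - h, T, m, hbar)) / (h * h)
    lhs = 1j * hbar * dU_dT
    rhs = -(hbar ** 2) / (2.0 * m) * d2U_dx2
    return abs(lhs - rhs) / abs(lhs)


residual = schrodinger_residual(0.7, 1.3)
print(residual)
```

The residual should be limited only by the finite-difference error, which is tiny compared with either side of the equation.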

As $T \to 0$, the left-hand side of (9.3) tends to $\delta(x_a - x_b)$. Compare this
to the value of (9.4) in the case of one time slice:

$$\frac{1}{C(\epsilon)} \exp\left[ \frac{i}{\hbar} \frac{m (x_b - x_a)^2}{2\epsilon} + \mathcal{O}(\epsilon) \right].$$

This is just the peaked exponential of (9.5), and it also tends to
$\delta(x_a - x_b)$ as $\epsilon \to 0$. Thus the left- and right-hand sides of (9.3)
satisfy the same differential equation with the same initial condition. We
conclude that the Hamiltonian definition of the time evolution operator (9.1)
and the path-integral definition (9.3) are equivalent, at least for the case of
this simple one-dimensional system.

To conclude this section, let us generalize our path-integral formula to
more complicated quantum systems. Consider a very general quantum system,
described by an arbitrary set of coordinates $q^i$, conjugate momenta $p^i$, and
Hamiltonian $H(q,p)$. We will give a direct proof of the path-integral formula
for transition amplitudes in this system.

The transition amplitude that we would like to compute is 

$$U(q_a, q_b; T) = \langle q_b | e^{-iHT} | q_a \rangle. \tag{9.8}$$

(When $q$ or $p$ appears without a superscript, it will denote the set of all
coordinates $\{q^i\}$ or momenta $\{p^i\}$. Also, for convenience, we now set
$\hbar = 1$.) To write this amplitude as a functional integral, we first break the
time interval into $N$ short slices of duration $\epsilon$. Thus we can write

$$e^{-iHT} = e^{-iH\epsilon}\, e^{-iH\epsilon}\, e^{-iH\epsilon} \cdots e^{-iH\epsilon} \qquad (N \text{ factors}).$$


The trick is to insert a complete set of intermediate states between each of 
these factors, in the form 


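$$1 = \left( \prod_i \int dq_k^i \right) | q_k \rangle \langle q_k |.$$

Inserting such factors for $k = 1 \ldots (N-1)$, we are left with a product of
factors of the form

$$\langle q_{k+1} | e^{-iH\epsilon} | q_k \rangle \xrightarrow[\epsilon \to 0]{} \langle q_{k+1} | (1 - iH\epsilon + \cdots) | q_k \rangle. \tag{9.9}$$

To express the first and last factors in this form, we define $q_0 = q_a$ and
$q_N = q_b$.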

Now we must look inside H and consider what kinds of terms it might 
contain. The simplest kind of term to evaluate would be a function only of the 
coordinates, not of the momenta. The matrix element of such a term would 
be 

$$\langle q_{k+1} | f(q) | q_k \rangle = f(q_k) \prod_i \delta(q_k^i - q_{k+1}^i).$$

It will be convenient to rewrite this as

$$\langle q_{k+1} | f(q) | q_k \rangle = f\!\left( \frac{q_{k+1} + q_k}{2} \right) \left( \prod_i \int \frac{dp_k^i}{2\pi} \right) \exp\left[ i \sum_i p_k^i (q_{k+1}^i - q_k^i) \right],$$

for reasons that will soon be apparent.

Next consider a term in the Hamiltonian that is purely a function of the
momenta. We introduce a complete set of momentum eigenstates to obtain

$$\langle q_{k+1} | f(p) | q_k \rangle = \left( \prod_i \int \frac{dp_k^i}{2\pi} \right) f(p_k) \exp\left[ i \sum_i p_k^i (q_{k+1}^i - q_k^i) \right].$$

Thus if $H$ contains only terms of the form $f(q)$ and $f(p)$, its matrix element
can be written

$$\langle q_{k+1} | H(q,p) | q_k \rangle = \left( \prod_i \int \frac{dp_k^i}{2\pi} \right) H\!\left( \frac{q_{k+1} + q_k}{2}, p_k \right) \exp\left[ i \sum_i p_k^i (q_{k+1}^i - q_k^i) \right]. \tag{9.10}$$

It would be nice if Eq. (9.10) were true even when $H$ contains products of
$p$'s and $q$'s. In general this formula must be false, since the order of a product
$pq$ matters on the left-hand side (where $H$ is an operator) but not on the
right-hand side (where $H$ is just a function of the numbers $p_k$ and $q_k$). But
for one specific ordering, we can preserve (9.10). For example, the combination

$$\langle q_{k+1} | \tfrac{1}{4} \left( q^2 p^2 + 2 q p^2 q + p^2 q^2 \right) | q_k \rangle = \left( \frac{q_{k+1} + q_k}{2} \right)^2 \langle q_{k+1} | p^2 | q_k \rangle$$

works out as desired, since the $q$'s appear symmetrically on the left and right
in just the right way. When this happens, the Hamiltonian is said to be Weyl
ordered. Any Hamiltonian can be put into Weyl order by commuting $p$'s and
$q$'s; in general this procedure will introduce some extra terms, and those extra
terms must appear on the right-hand side of (9.10).

Assuming from now on that $H$ is Weyl ordered, our typical matrix element
from (9.9) can be expressed as

$$\langle q_{k+1} | e^{-i\epsilon H} | q_k \rangle = \left( \prod_i \int \frac{dp_k^i}{2\pi} \right) \exp\left[ -i\epsilon H\!\left( \frac{q_{k+1} + q_k}{2}, p_k \right) \right] \exp\left[ i \sum_i p_k^i (q_{k+1}^i - q_k^i) \right].$$

(We have again used the fact that $\epsilon$ is small, writing $1 - i\epsilon H$ as
$e^{-i\epsilon H}$.) To obtain $U(q_a, q_b; T)$, we multiply $N$ such factors, one for
each $k$, and integrate over the intermediate coordinates $q_k$:


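$$U(q_0, q_N; T) = \left( \prod_{i,k} \int dq_k^i \int \frac{dp_k^i}{2\pi} \right) \exp\left[ i \sum_k \left( \sum_i p_k^i (q_{k+1}^i - q_k^i) - \epsilon H\!\left( \frac{q_{k+1} + q_k}{2}, p_k \right) \right) \right]. \tag{9.11}$$

There is one momentum integral for each $k$ from 1 to $N$, and one coordinate
integral for each $k$ from 1 to $N-1$. This expression is therefore the discretized
form of

$$U(q_a, q_b; T) = \left( \prod_i \int \mathcal{D}q(t)\, \mathcal{D}p(t) \right) \exp\left[ i \int_0^T dt \left( \sum_i p^i \dot{q}^i - H(q,p) \right) \right], \tag{9.12}$$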


where the functions $q(t)$ are constrained at the endpoints, but the functions
$p(t)$ are not. Note that the integration measure $\mathcal{D}q$ contains no peculiar
constants, as it did in (9.4). The functional measure in (9.12) is just the
product of the standard integral over phase space

$$\prod_i \int \frac{dq^i\, dp^i}{2\pi\hbar}$$

at each point in time. Equation (9.12) is the most general formula for
computing transition amplitudes via functional integrals.

For a nonrelativistic particle, the Hamiltonian is simply $H = p^2/2m + V(q)$.
In this case we can evaluate the $p$-integrals by completing the square in
the exponent:

$$\int \frac{dp_k}{2\pi} \exp\left[ i \left( p_k (q_{k+1} - q_k) - \epsilon p_k^2 / 2m \right) \right] = \frac{1}{C(\epsilon)} \exp\left[ \frac{im}{2\epsilon} (q_{k+1} - q_k)^2 \right],$$

where $C(\epsilon)$ is just the factor (9.6). Notice that we have one such factor for each
time slice. Thus we recover expression (9.3), in discretized form, including the
proper factors of $C$:

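$$U(q_a, q_b; T) = \left( \frac{1}{C(\epsilon)} \prod_k \int \frac{dq_k}{C(\epsilon)} \right) \exp\left[ i \sum_k \left( \frac{m}{2} \frac{(q_{k+1} - q_k)^2}{\epsilon} - \epsilon V\!\left( \frac{q_{k+1} + q_k}{2} \right) \right) \right]. \tag{9.13}$$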
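
The momentum integral used to pass from (9.12) to (9.13) can be checked numerically, provided we supply the small damping term that the text postpones. The sketch below gives $\epsilon$ a small negative imaginary part (our own regulator choice, in the spirit of $t \to t(1 - i\epsilon)$), evaluates the $p$-integral as a Riemann sum, and compares it with $C(\epsilon)^{-1} \exp[im(\Delta q)^2/2\epsilon]$ at $\hbar = 1$. The function names, cutoff, and grid size are illustrative assumptions.

```python
import cmath


def momentum_integral(delta_q, eps, m=1.0, p_max=60.0, n=120001):
    """Riemann sum for  int dp/(2 pi) exp[i (p*delta_q - eps*p^2/(2m))].
    `eps` must have a negative imaginary part so the integrand decays."""
    dp = 2.0 * p_max / (n - 1)
    total = 0j
    for j in range(n):
        p = -p_max + j * dp
        total += cmath.exp(1j * (p * delta_q - eps * p * p / (2.0 * m)))
    return total * dp / (2.0 * cmath.pi)


def closed_form(delta_q, eps, m=1.0):
    """(1/C(eps)) * exp[i m delta_q^2 / (2 eps)], with 1/C = sqrt(m / (2 pi i eps))."""
    inv_C = cmath.sqrt(m / (2j * cmath.pi * eps))
    return inv_C * cmath.exp(1j * m * delta_q ** 2 / (2.0 * eps))


eps = 0.5 * (1 - 0.2j)          # time step with an assumed damping rotation
numeric = momentum_integral(0.8, eps)
analytic = closed_form(0.8, eps)
print(numeric, analytic)
```

With the damping in place the two values should agree to many digits; the undamped formula is then understood as the limit in which the imaginary part of $\epsilon$ goes to zero.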


9.2 Functional Quantization of Scalar Fields 


In this section we will apply the functional integral formalism to the quantum
theory of a real scalar field $\phi(x)$. Our goal is to derive the Feynman rules for
such a theory directly from functional integral expressions.

The general functional integral formula (9.12) derived in the last section
holds for any quantum system, so it should hold for a quantum field theory.
In the case of a real scalar field, the coordinates $q^i$ are the field amplitudes
$\phi(\mathbf{x})$, and the Hamiltonian is

$$H = \int d^3x \left[ \frac{1}{2}\pi^2 + \frac{1}{2}(\nabla\phi)^2 + V(\phi) \right].$$

Thus our formula becomes

$$\langle \phi_b(\mathbf{x}) | e^{-iHT} | \phi_a(\mathbf{x}) \rangle = \int \mathcal{D}\phi\, \mathcal{D}\pi\, \exp\left[ i \int_0^T d^4x \left( \pi\dot{\phi} - \frac{1}{2}\pi^2 - \frac{1}{2}(\nabla\phi)^2 - V(\phi) \right) \right],$$

where the functions $\phi(x)$ over which we integrate are constrained to the
specific configurations $\phi_a(\mathbf{x})$ at $x^0 = 0$ and $\phi_b(\mathbf{x})$ at $x^0 = T$.
Since the exponent is quadratic in $\pi$, we can complete the square and evaluate
the $\mathcal{D}\pi$ integral to obtain

$$\langle \phi_b(\mathbf{x}) | e^{-iHT} | \phi_a(\mathbf{x}) \rangle = \int \mathcal{D}\phi\, \exp\left[ i \int_0^T d^4x\, \mathcal{L} \right], \tag{9.14}$$

where

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - V(\phi)$$

is the Lagrangian density. The integration measure $\mathcal{D}\phi$ in (9.14) again
involves an awkward constant, which we will not write explicitly.

The time integral in the exponent of (9.14) goes from 0 to $T$, as determined
by our choice of what transition function to compute; in all other
respects this formula is manifestly Lorentz invariant. Any other symmetries
that the Lagrangian may have are also explicitly preserved by the functional
integral. As we proceed in our study of quantum field theory, symmetries and
their associated conservation laws will play an increasingly central role. We
therefore propose to take a rash step: Abandon the Hamiltonian formalism,
and take Eq. (9.14) to define the Hamiltonian dynamics. Any such formula
corresponds to some Hamiltonian; to find it, one can always differentiate with
respect to $T$ and derive the Schrödinger equation as in the previous section.
We thus consider the Lagrangian $\mathcal{L}$ to be the most fundamental specification
of a quantum field theory. We will see next that one can use the functional
integral to compute from $\mathcal{L}$ directly, without invoking the Hamiltonian at all.


Correlation Functions 


To make direct use of the functional integral, we need a functional formula
for computing correlation functions. To find such an expression, consider the
object

$$\int \mathcal{D}\phi(x)\, \phi(x_1)\phi(x_2) \exp\left[ i \int_{-T}^{T} d^4x\, \mathcal{L}(\phi) \right], \tag{9.15}$$


where the boundary conditions on the path integral are
$\phi(-T, \mathbf{x}) = \phi_a(\mathbf{x})$ and $\phi(T, \mathbf{x}) = \phi_b(\mathbf{x})$ for some
$\phi_a$, $\phi_b$. We would like to relate this quantity to the two-point
correlation function, $\langle \Omega | T \phi_H(x_1) \phi_H(x_2) | \Omega \rangle$. (To
distinguish operators from ordinary numbers, we write the Heisenberg picture
operator with an explicit subscript: $\phi_H(x)$. Similarly, we will write
$\phi_S(\mathbf{x})$ for the Schrödinger picture operator.)

First we break up the functional integral in (9.15) as follows:

$$\int \mathcal{D}\phi(x) = \int \mathcal{D}\phi_1(\mathbf{x}) \int \mathcal{D}\phi_2(\mathbf{x}) \int_{\substack{\phi(x_1^0, \mathbf{x}) = \phi_1(\mathbf{x}) \\ \phi(x_2^0, \mathbf{x}) = \phi_2(\mathbf{x})}} \mathcal{D}\phi(x). \tag{9.16}$$

The main functional integral $\int \mathcal{D}\phi(x)$ is now constrained at times $x_1^0$
and $x_2^0$ (in addition to the endpoints $-T$ and $T$), but we must integrate
separately over the intermediate configurations $\phi_1(\mathbf{x})$ and
$\phi_2(\mathbf{x})$. After this decomposition, the extra factors $\phi(x_1)$ and
$\phi(x_2)$ in (9.15) become $\phi_1(\mathbf{x}_1)$ and $\phi_2(\mathbf{x}_2)$, and
can be taken outside the main integral. The main integral then factors into
three pieces, each being a simple transition amplitude according to (9.14).
The times $x_1^0$ and $x_2^0$ automatically fall in order; for example, if
$x_1^0 < x_2^0$, then (9.15) becomes

$$\int \mathcal{D}\phi_1(\mathbf{x}) \int \mathcal{D}\phi_2(\mathbf{x})\, \phi_1(\mathbf{x}_1) \phi_2(\mathbf{x}_2)\, \langle \phi_b | e^{-iH(T - x_2^0)} | \phi_2 \rangle \times \langle \phi_2 | e^{-iH(x_2^0 - x_1^0)} | \phi_1 \rangle \langle \phi_1 | e^{-iH(x_1^0 + T)} | \phi_a \rangle.$$

We can turn the field $\phi_1(\mathbf{x}_1)$ into a Schrödinger operator using
$\phi_S(\mathbf{x}_1) | \phi_1 \rangle = \phi_1(\mathbf{x}_1) | \phi_1 \rangle$. The
completeness relation $\int \mathcal{D}\phi_1\, | \phi_1 \rangle \langle \phi_1 | = 1$ then
allows us to eliminate the intermediate state $| \phi_1 \rangle$. Similar
manipulations work for $\phi_2$, yielding the expression

$$\langle \phi_b | e^{-iH(T - x_2^0)}\, \phi_S(\mathbf{x}_2)\, e^{-iH(x_2^0 - x_1^0)}\, \phi_S(\mathbf{x}_1)\, e^{-iH(x_1^0 + T)} | \phi_a \rangle.$$

Most of the exponential factors combine with the Schrödinger operators to
make Heisenberg operators. In the case $x_1^0 > x_2^0$, the order of $x_1$ and $x_2$
would simply be interchanged. Thus expression (9.15) is equal to

$$\langle \phi_b | e^{-iHT}\, T\{ \phi_H(x_1) \phi_H(x_2) \}\, e^{-iHT} | \phi_a \rangle. \tag{9.17}$$

This expression is almost equal to the two-point correlation function. To
make it more nearly equal, we take the limit $T \to \infty(1 - i\epsilon)$. Just as in
Section 4.2, this trick projects out the vacuum state $| \Omega \rangle$ from
$| \phi_a \rangle$ and $| \phi_b \rangle$ (provided that these states have some overlap
with $| \Omega \rangle$, which we assume). For example, decomposing $| \phi_a \rangle$
into eigenstates $| n \rangle$ of $H$, we have

$$e^{-iHT} | \phi_a \rangle = \sum_n e^{-iE_n T} | n \rangle \langle n | \phi_a \rangle \xrightarrow[T \to \infty(1 - i\epsilon)]{} \langle \Omega | \phi_a \rangle\, e^{-iE_0 \cdot \infty(1 - i\epsilon)} | \Omega \rangle.$$

As in Section 4.2, we obtain some awkward phase and overlap factors. But
these factors cancel if we divide by the same quantity as (9.15) but without
the two extra fields $\phi(x_1)$ and $\phi(x_2)$. Thus we obtain the simple formula

$$\langle \Omega | T \phi_H(x_1) \phi_H(x_2) | \Omega \rangle = \lim_{T \to \infty(1 - i\epsilon)} \frac{\int \mathcal{D}\phi\, \phi(x_1)\phi(x_2) \exp\left[ i \int_{-T}^{T} d^4x\, \mathcal{L} \right]}{\int \mathcal{D}\phi\, \exp\left[ i \int_{-T}^{T} d^4x\, \mathcal{L} \right]}. \tag{9.18}$$

This is our desired formula for the two-point correlation function in terms
of functional integrals. For higher correlation functions, just insert additional
factors of $\phi$ on both sides.


Feynman Rules 

Our next task is to compute various correlation functions directly from the
right-hand side of formula (9.18). In other words, we will now use (9.18) to
derive the Feynman rules for a scalar field theory. We will begin by computing
the two-point function in the free Klein-Gordon theory, then generalize to
higher correlation functions in the free theory. Finally, we will consider $\phi^4$
theory, in which we can perform a perturbation expansion to obtain the same
Feynman rules as in Section 4.4.

Consider first a noninteracting real-valued scalar field:

$$S_0 = \int d^4x\, \mathcal{L}_0 = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2} m^2 \phi^2 \right]. \tag{9.19}$$

Since $\mathcal{L}_0$ is quadratic in $\phi$, the functional integrals in (9.18) take the form
of generalized, infinite-dimensional Gaussian integrals. We will therefore be
able to evaluate the functional integrals exactly.

Since this is our first functional integral computation, we will do it in a
very explicit, but ugly, way. We must first define the integral $\mathcal{D}\phi$ over field
configurations. To do this, we use the method of Eq. (9.4) in considering the
continuous integral as a limit of a large but finite number of integrals. We
thus replace the variables $\phi(x)$ defined on a continuum of points by variables
$\phi(x_i)$ defined at the points $x_i$ of a square lattice. Let the lattice spacing be
$\epsilon$, let the four-dimensional spacetime volume be $L^4$, and define

$$\mathcal{D}\phi = \prod_i d\phi(x_i), \tag{9.20}$$

up to an irrelevant overall constant.

The field values $\phi(x_i)$ can be represented by a discrete Fourier series:

$$\phi(x_i) = \frac{1}{V} \sum_n e^{-ik_n \cdot x_i} \phi(k_n), \tag{9.21}$$

where $k_n^\mu = 2\pi n^\mu / L$, with $n^\mu$ an integer, $|k^\mu| < \pi/\epsilon$, and
$V = L^4$. The Fourier coefficients $\phi(k)$ are complex. However, $\phi(x)$ is real,
and so these coefficients must obey the constraint $\phi^*(k) = \phi(-k)$. We will
consider the real and imaginary parts of the $\phi(k_n)$ with $k_n^0 > 0$ as
independent variables. The change of variables from the $\phi(x_i)$ to these new
variables $\phi(k_n)$ is a unitary transformation, so we can rewrite the integral as

$$\mathcal{D}\phi(x) = \prod_{k_n^0 > 0} d\,\mathrm{Re}\,\phi(k_n)\; d\,\mathrm{Im}\,\phi(k_n).$$

Later, we will take the limit $L \to \infty$, $\epsilon \to 0$. The effect of this limit is to
convert discrete, finite sums over $k_n$ to continuous integrals over $k$:

$$\frac{1}{V} \sum_n \to \int \frac{d^4k}{(2\pi)^4}. \tag{9.22}$$



In the following discussion, this limit will produce Feynman perturbation
theory in the form derived in Part I. We will not eliminate the infrared and
ultraviolet divergences of Feynman diagrams that we encountered in
Chapter 6, but at least the functional integral introduces no new types of
singular behavior.

Having defined the measure of integration, we now compute the functional
integral over $\phi$. The action (9.19) can be rewritten in terms of the Fourier
coefficients as

$$\int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2} m^2 \phi^2 \right] = -\frac{1}{V} \sum_n \frac{1}{2} (m^2 - k_n^2) |\phi(k_n)|^2 = -\frac{1}{V} \sum_{k_n^0 > 0} (m^2 - k_n^2) \left[ (\mathrm{Re}\,\phi_n)^2 + (\mathrm{Im}\,\phi_n)^2 \right],$$

where we have abbreviated $\phi(k_n)$ as $\phi_n$ in the second line. The quantity
$(m^2 - k_n^2) = (m^2 + |\mathbf{k}_n|^2 - k_n^{0\,2})$ is positive as long as $k_n^0$ is not
too large. In the following discussion, we will treat this quantity as if it were
positive. More precisely, we evaluate it by analytic continuation from the
region where $|\mathbf{k}_n| > k_n^0$.

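The denominator of formula (9.18) now takes the form of a product of
Gaussian integrals:

$$\int \mathcal{D}\phi\, e^{iS_0} = \left( \prod_{k_n^0 > 0} \int d\,\mathrm{Re}\,\phi_n\, d\,\mathrm{Im}\,\phi_n \right) \exp\left[ -\frac{i}{V} \sum_{n | k_n^0 > 0} (m^2 - k_n^2) |\phi_n|^2 \right]$$

$$= \prod_{k_n^0 > 0} \left( \int d\,\mathrm{Re}\,\phi_n \exp\left[ -\frac{i}{V} (m^2 - k_n^2) (\mathrm{Re}\,\phi_n)^2 \right] \right) \left( \int d\,\mathrm{Im}\,\phi_n \exp\left[ -\frac{i}{V} (m^2 - k_n^2) (\mathrm{Im}\,\phi_n)^2 \right] \right)$$

$$= \prod_{k_n^0 > 0} \sqrt{\frac{-i\pi V}{m^2 - k_n^2}} \sqrt{\frac{-i\pi V}{m^2 - k_n^2}} = \prod_{\text{all } k_n} \sqrt{\frac{-i\pi V}{m^2 - k_n^2}}. \tag{9.23}$$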


To justify using Gaussian integration formulae when the exponent appears to
be purely imaginary, recall that the time integral in (9.18) is along a contour
that is rotated clockwise in the complex plane: $t \to t(1 - i\epsilon)$. This means that
we should change $k^0 \to k^0(1 + i\epsilon)$ in (9.21) and all subsequent equations; in
particular, we should replace $(k^2 - m^2) \to (k^2 - m^2 + i\epsilon)$. The $i\epsilon$ term
gives the necessary convergence factor for the Gaussian integrals. It also
defines the direction of the analytic continuation that might be needed to
define the square roots in (9.23).

To understand the result of (9.23), consider as an analogy the general
Gaussian integral

$$\left( \prod_k \int d\xi_k \right) \exp\left[ -\xi_i B_{ij} \xi_j \right],$$

where $B$ is a symmetric matrix with eigenvalues $b_i$. To evaluate this integral
we write $\xi_i = O_{ij} x_j$, where $O$ is the orthogonal matrix of eigenvectors that
diagonalizes $B$. Changing variables from $\xi_i$ to the coefficients $x_i$, we have

$$\left( \prod_k \int d\xi_k \right) \exp\left[ -\xi_i B_{ij} \xi_j \right] = \left( \prod_k \int dx_k \right) \exp\left[ -\sum_i b_i x_i^2 \right] = \prod_i \left( \int dx_i \exp\left[ -b_i x_i^2 \right] \right) = \prod_i \sqrt{\frac{\pi}{b_i}} = \text{const} \times [\det B]^{-1/2}. \tag{9.24}$$
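
For two variables the constant in (9.24) is $\pi$, and the formula is easy to test by brute force. The sketch below sums $\exp[-\xi_i B_{ij} \xi_j]$ over a square grid and compares the result with $\pi/\sqrt{\det B}$. The matrix entries, grid size, and function name are illustrative assumptions.

```python
import math


def gaussian_integral_2d(B, half_width=8.0, n=401):
    """Riemann sum for  int d^2 xi exp(-xi^T B xi)  over [-half_width, half_width]^2,
    for a symmetric positive-definite 2x2 matrix B given as nested lists."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for a in range(n):
        x = -half_width + a * h
        for b in range(n):
            y = -half_width + b * h
            quad = B[0][0] * x * x + 2.0 * B[0][1] * x * y + B[1][1] * y * y
            total += math.exp(-quad)
    return total * h * h


B = [[2.0, 0.6],
     [0.6, 1.0]]                       # sample symmetric matrix (assumed values)
det_B = B[0][0] * B[1][1] - B[0][1] ** 2
numeric = gaussian_integral_2d(B)
expected = math.pi / math.sqrt(det_B)  # pi^(n/2) * det(B)^(-1/2) with n = 2
print(numeric, expected)
```

Because the off-diagonal entry is nonzero, the agreement tests the determinant and not merely the product of the diagonal entries.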

The analogy is clearer if we perform an integration by parts to write the
Klein-Gordon action as

$$S_0 = \frac{1}{2} \int d^4x\, \phi (-\partial^2 - m^2) \phi + (\text{surface term}).$$

Thus the matrix $B$ corresponds to the operator $(m^2 + \partial^2)$, and we can
formally write our result as

$$\int \mathcal{D}\phi\, e^{iS_0} = \text{const} \times \left[ \det(m^2 + \partial^2) \right]^{-1/2}. \tag{9.25}$$

This object is called a functional determinant. The actual result (9.23) looks
quite ill-defined, and in fact all of these factors will cancel in Eq. (9.18).
However, in many circumstances, the functional determinant itself has physical
meaning. We will see examples of this in Sections 9.5 and 11.4.

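Now consider the numerator of formula (9.18). We need to Fourier-expand
the two extra factors of $\phi$:

$$\phi(x_1)\phi(x_2) = \frac{1}{V} \sum_m e^{-ik_m \cdot x_1} \phi_m\; \frac{1}{V} \sum_l e^{-ik_l \cdot x_2} \phi_l.$$

Thus the numerator is

$$\frac{1}{V^2} \sum_{m,l} e^{-i(k_m \cdot x_1 + k_l \cdot x_2)} \left( \prod_{k_n^0 > 0} \int d\,\mathrm{Re}\,\phi_n\, d\,\mathrm{Im}\,\phi_n \right) \times (\mathrm{Re}\,\phi_m + i\,\mathrm{Im}\,\phi_m)(\mathrm{Re}\,\phi_l + i\,\mathrm{Im}\,\phi_l) \times \exp\left[ -\frac{i}{V} \sum_{n | k_n^0 > 0} (m^2 - k_n^2) \left[ (\mathrm{Re}\,\phi_n)^2 + (\mathrm{Im}\,\phi_n)^2 \right] \right]. \tag{9.26}$$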

For most values of $k_m$ and $k_l$ this expression is zero, since the extra factors
of $\phi$ make the integrand odd. The situation is more complicated when
$k_m = \pm k_l$. Suppose, for example, that $k_m^0 > 0$. Then if $k_l = +k_m$, the
term involving $(\mathrm{Re}\,\phi_m)^2$ is nonzero, but is exactly canceled by the term
involving $(\mathrm{Im}\,\phi_m)^2$. If $k_l = -k_m$, however, the relation
$\phi(-k) = \phi^*(k)$ gives an extra minus sign on the $(\mathrm{Im}\,\phi_m)^2$ term, so
the two terms add. When $k_m^0 < 0$ we obtain the same expression, so the
numerator is

$$\text{Numerator} = \frac{1}{V^2} \sum_m e^{-ik_m \cdot (x_1 - x_2)} \left( \prod_{k_n^0 > 0} \frac{-i\pi V}{m^2 - k_n^2} \right) \frac{-iV}{m^2 - k_m^2 - i\epsilon}.$$

The factor in parentheses is identical to the denominator (9.23), while the rest
of this expression is the discretized form of the Feynman propagator. Taking
the continuum limit (9.22), we find

$$\langle 0 | T \phi(x_1) \phi(x_2) | 0 \rangle = \int \frac{d^4k}{(2\pi)^4} \frac{i\, e^{-ik \cdot (x_1 - x_2)}}{k^2 - m^2 + i\epsilon} = D_F(x_1 - x_2). \tag{9.27}$$

This is exactly right, including the $+i\epsilon$.

Next we would like to compute higher correlation functions in the free 
Klein-Gordon theory. 

Inserting an extra factor of $\phi$ in (9.18), we see that the three-point function
vanishes, since the integrand of the numerator is odd. All other odd correlation
functions vanish for the same reason.

The four-point function has four factors of $\phi$ in the numerator.
Fourier-expanding the fields, we obtain an expression similar to Eq. (9.26), but
with a quadruple sum over indices that we will call $m$, $l$, $p$, and $q$. The
integrand contains the product

$$(\mathrm{Re}\,\phi_m + i\,\mathrm{Im}\,\phi_m)(\mathrm{Re}\,\phi_l + i\,\mathrm{Im}\,\phi_l)(\mathrm{Re}\,\phi_p + i\,\mathrm{Im}\,\phi_p)(\mathrm{Re}\,\phi_q + i\,\mathrm{Im}\,\phi_q).$$


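Again, most of the terms vanish because the integrand is odd. One of the
nonvanishing terms occurs when $k_l = -k_m$ and $k_q = -k_p$. After the Gaussian
integrations, this term of the numerator is

$$\frac{1}{V^4} \sum_{m,p} e^{-ik_m \cdot (x_1 - x_2)} e^{-ik_p \cdot (x_3 - x_4)} \left( \prod_{k_n^0 > 0} \frac{-i\pi V}{m^2 - k_n^2} \right) \frac{-iV}{m^2 - k_m^2 - i\epsilon}\, \frac{-iV}{m^2 - k_p^2 - i\epsilon} \xrightarrow[V \to \infty]{} \left( \prod_{k_n^0 > 0} \frac{-i\pi V}{m^2 - k_n^2} \right) D_F(x_1 - x_2)\, D_F(x_3 - x_4).$$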



The factor in parentheses is again canceled by the denominator. We obtain
similar terms for each of the other two ways of grouping the four momenta
in pairs. To keep track of the groupings, let us define the contraction of two
fields as

$$\overline{\phi(x_1)\phi(x_2)} = \frac{\int \mathcal{D}\phi\, e^{iS_0}\, \phi(x_1)\phi(x_2)}{\int \mathcal{D}\phi\, e^{iS_0}} = D_F(x_1 - x_2). \tag{9.28}$$

Then the four-point function is simply

$$\langle 0 | T \phi_1 \phi_2 \phi_3 \phi_4 | 0 \rangle = \text{sum of all full contractions} = D_F(x_1 - x_2) D_F(x_3 - x_4) + D_F(x_1 - x_3) D_F(x_2 - x_4) + D_F(x_1 - x_4) D_F(x_2 - x_3), \tag{9.29}$$

the same expression that we obtained using Wick's theorem in Eq. (4.40).
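
The "sum of all full contractions" in (9.29) is a sum over all ways of grouping the points into unordered pairs, which is easy to enumerate mechanically. The following sketch does so recursively; the function names and the toy propagator `exp(-|a - b|)` are purely illustrative assumptions standing in for $D_F$.

```python
import math


def full_contractions(points):
    """Yield every way of grouping `points` (a list) into unordered pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in full_contractions(remaining):
            yield [(first, partner)] + tail


def free_correlator(points, propagator):
    """Sum over full contractions of the product of pair propagators.
    Returns zero for an odd number of points."""
    if len(points) % 2:
        return 0.0
    total = 0.0
    for pairing in full_contractions(list(points)):
        term = 1.0
        for a, b in pairing:
            term *= propagator(a, b)
        total += term
    return total


def toy_propagator(a, b):
    """Hypothetical stand-in for D_F, used only to exercise the combinatorics."""
    return math.exp(-abs(a - b))


pairings_4 = list(full_contractions([1, 2, 3, 4]))
print(pairings_4)                                   # the three groupings of (9.29)
print(len(list(full_contractions(list(range(6))))))  # number of groupings for six points
print(free_correlator([0, 1, 2, 3], toy_propagator))
```

For $2n$ points the number of pairings is $(2n-1)!! = 1 \cdot 3 \cdot 5 \cdots (2n-1)$, so the six-point function has 15 terms.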

The same method allows us to compute still higher correlation functions.
In each case the answer is just the sum of all possible full contractions of
the fields. This result, identical to that obtained from Wick's theorem in
Section 4.3, arises here from the simple rules of Gaussian integration.

We are now ready to move from the free Klein-Gordon theory to $\phi^4$ theory.
Add to $\mathcal{L}_0$ a $\phi^4$ interaction:

$$\mathcal{L} = \mathcal{L}_0 - \frac{\lambda}{4!} \phi^4.$$

Assuming that $\lambda$ is small, we can expand

$$\exp\left[ i \int d^4x\, \mathcal{L} \right] = \exp\left[ i \int d^4x\, \mathcal{L}_0 \right] \left( 1 - i \int d^4x\, \frac{\lambda}{4!} \phi^4 + \cdots \right).$$

Making this expansion in both the numerator and the denominator of (9.18),
we see that each is (aside from the constant factor (9.23), which again cancels)
expressed entirely in terms of free-field correlation functions. Moreover, since
$i \int d^3x\, \mathcal{L}_{\text{int}} = -iH_{\text{int}}$, we obtain exactly the same expansion as
in Eq. (4.31).

We can express both the numerator and the denominator in terms of Feynman
diagrams, with the fundamental interaction again given by the vertex

$$\text{(four-point vertex)} = -i\lambda\, (2\pi)^4\, \delta^{(4)}\!\left( \sum p \right). \tag{9.30}$$


All of the combinatorics work the same as in Section 4.4. In particular, the
disconnected vacuum bubble diagrams exponentiate and factor from the
numerator of (9.18), and are canceled by the denominator, just as in Eq. (4.31).

The vertex rule for $\phi^4$ theory follows from the Lagrangian in an exceedingly
simple way, and this simple procedure will turn out to be valid for other
quantum field theories as well. Once the quadratic terms in the Lagrangian
are properly understood and the propagators of the theory are computed, the
vertices can be read directly from the Lagrangian as the coefficients of the
cubic and higher-order terms.


Functional Derivatives and the Generating Functional 


To conclude this section, we will now introduce a slicker, more formal, method
for computing correlation functions. This method, based on an object called
the generating functional, avoids the awkward Fourier expansions of the
preceding derivation.

First we define the functional derivative, $\delta/\delta J(x)$, as follows. The
functional derivative obeys the basic axiom (in four dimensions)

$$\frac{\delta}{\delta J(x)} J(y) = \delta^{(4)}(x - y) \qquad \text{or} \qquad \frac{\delta}{\delta J(x)} \int d^4y\, J(y)\phi(y) = \phi(x). \tag{9.31}$$

This definition is the natural generalization, to continuous functions, of the
rule for discrete vectors,

$$\frac{\partial}{\partial x_i} x_j = \delta_{ij} \qquad \text{or} \qquad \frac{\partial}{\partial x_i} \sum_j x_j k_j = k_i.$$

To take functional derivatives of more complicated functionals we simply use
the ordinary rules for derivatives of composite functions. For example,

$$\frac{\delta}{\delta J(x)} \exp\left[ i \int d^4y\, J(y)\phi(y) \right] = i\phi(x) \exp\left[ i \int d^4y\, J(y)\phi(y) \right]. \tag{9.32}$$

When the functional depends on the derivative of $J$, we integrate by parts
before applying the functional derivative:

$$\frac{\delta}{\delta J(x)} \int d^4y\, \partial_\mu J(y)\, V^\mu(y) = -\partial_\mu V^\mu(x). \tag{9.33}$$
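
On a grid, the functional derivative becomes an ordinary partial derivative divided by the cell volume, $\delta F/\delta J(x_i) \approx (1/\Delta x)\, \partial F/\partial J_i$. The sketch below uses this rule to check (9.31) and (9.32) on a one-dimensional grid. Working in one dimension instead of four, and all the names, grid parameters, and sample functions, are simplifying assumptions of ours.

```python
import cmath
import math


def functional_derivative(F, J, i, dx, h=1e-6):
    """Approximate delta F / delta J(x_i) on a grid as (1/dx) * dF/dJ_i,
    with the partial derivative taken by a central difference of step h."""
    J_plus = list(J)
    J_plus[i] += h
    J_minus = list(J)
    J_minus[i] -= h
    return (F(J_plus) - F(J_minus)) / (2.0 * h * dx)


n, dx = 200, 0.05
xs = [k * dx for k in range(n)]
phi = [math.sin(x) for x in xs]     # sample field configuration (assumed)
J0 = [math.exp(-x) for x in xs]     # sample source (assumed)


def linear_functional(J):
    """F[J] = int dy J(y) phi(y), written as a Riemann sum."""
    return sum(Jk * pk for Jk, pk in zip(J, phi)) * dx


def exp_functional(J):
    """exp(i * int dy J(y) phi(y))."""
    return cmath.exp(1j * linear_functional(J))


i = 37
result_linear = functional_derivative(linear_functional, J0, i, dx)
result_exp = functional_derivative(exp_functional, J0, i, dx)
print(result_linear, phi[i])
print(result_exp, 1j * phi[i] * exp_functional(J0))
```

The factor $1/\Delta x$ is what turns the Kronecker delta of the discrete rule into the delta function of (9.31) in the continuum limit.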

The basic object of this formalism is the generating functional of correlation
functions, $Z[J]$. (Some authors call it $W[J]$.) In a scalar field theory,
$Z[J]$ is defined as

$$Z[J] \equiv \int \mathcal{D}\phi\, \exp\left[ i \int d^4x \left[ \mathcal{L} + J(x)\phi(x) \right] \right]. \tag{9.34}$$

This is a functional integral over $\phi$ in which we have added to $\mathcal{L}$ in the
exponent a source term, $J(x)\phi(x)$.

Correlation functions of the Klein-Gordon field theory can be simply computed
by taking functional derivatives of the generating functional. For example,
the two-point function is

$$\langle 0 | T \phi(x_1) \phi(x_2) | 0 \rangle = \frac{1}{Z_0} \left( -i \frac{\delta}{\delta J(x_1)} \right) \left( -i \frac{\delta}{\delta J(x_2)} \right) Z[J] \Big|_{J=0}, \tag{9.35}$$

where $Z_0 = Z[J = 0]$. Each functional derivative brings down a factor of $\phi$ in
the numerator of $Z[J]$; setting $J = 0$, we recover expression (9.18). To compute
higher correlation functions we simply take more functional derivatives.

Formula (9.35) is useful because, in a free field theory, $Z[J]$ can be rewritten
in a very explicit form. Consider the exponent of (9.34) in the free
Klein-Gordon theory. Integrating by parts, we obtain

$$\int d^4x \left[ \mathcal{L}_0(\phi) + J\phi \right] = \int d^4x \left[ \frac{1}{2} \phi (-\partial^2 - m^2 + i\epsilon) \phi + J\phi \right]. \tag{9.36}$$

(The $i\epsilon$ is a convergence factor for the functional integral, as we discussed
below Eq. (9.23).) We can complete the square by introducing a shifted field,

$$\phi'(x) \equiv \phi(x) - i \int d^4y\, D_F(x - y) J(y).$$

Making this substitution and using the fact that $D_F$ is a Green's function of
the Klein-Gordon operator, we find that (9.36) becomes

$$\int d^4x \left[ \mathcal{L}_0(\phi) + J\phi \right] = \int d^4x \left[ \frac{1}{2} \phi' (-\partial^2 - m^2 + i\epsilon) \phi' \right] - \int d^4x\, d^4y\, \frac{1}{2} J(x) \left[ -iD_F(x - y) \right] J(y).$$

More symbolically, we could write the change of variables as

$$\phi' \equiv \phi + (-\partial^2 - m^2 + i\epsilon)^{-1} J, \tag{9.37}$$

and the result

$$\int d^4x \left[ \mathcal{L}_0(\phi) + J\phi \right] = \int d^4x \left[ \frac{1}{2} \phi' (-\partial^2 - m^2 + i\epsilon) \phi' - \frac{1}{2} J (-\partial^2 - m^2 + i\epsilon)^{-1} J \right]. \tag{9.38}$$

Now change variables from $\phi$ to $\phi'$ in the functional integral of (9.34).
This is just a shift, and so the Jacobian of the transformation is 1. The result
is

$$\int \mathcal{D}\phi'\, \exp\left[ i \int d^4x\, \mathcal{L}_0(\phi') \right] \exp\left[ -i \int d^4x\, d^4y\, \frac{1}{2} J(x) \left[ -iD_F(x - y) \right] J(y) \right].$$

The second exponential factor is independent of $\phi'$, while the remaining
integral over $\phi'$ is precisely $Z_0$. Thus the generating functional of the free
Klein-Gordon theory is simply

$$Z[J] = Z_0 \exp\left[ -\frac{1}{2} \int d^4x\, d^4y\, J(x) D_F(x - y) J(y) \right]. \tag{9.39}$$

Let us use Eqs. (9.39) and (9.35) to compute some correlation functions.
The two-point function is

$$\langle 0 | T \phi(x_1) \phi(x_2) | 0 \rangle = -\frac{\delta}{\delta J(x_1)} \frac{\delta}{\delta J(x_2)} \exp\left[ -\frac{1}{2} \int d^4x\, d^4y\, J(x) D_F(x - y) J(y) \right] \Big|_{J=0}$$

$$= -\frac{\delta}{\delta J(x_1)} \left[ -\frac{1}{2} \int d^4y\, D_F(x_2 - y) J(y) - \frac{1}{2} \int d^4x\, J(x) D_F(x - x_2) \right] \frac{Z[J]}{Z_0} \Big|_{J=0} = D_F(x_1 - x_2). \tag{9.40}$$

Taking one derivative brings down two identical terms; the second derivative
gives several terms, but only when it acts on the outside factor do we get a
term that survives when we set $J = 0$.

It is instructive to work out the four-point function by this method as
well. In order to fit the computation in a reasonable amount of space, let
us abbreviate arguments of functions as subscripts: $\phi_1 \equiv \phi(x_1)$,
$J_x \equiv J(x)$, $D_{x4} \equiv D_F(x - x_4)$, and so on. Repeated subscripts will be
integrated over implicitly. The four-point function is then

$$\langle 0 | T \phi_1 \phi_2 \phi_3 \phi_4 | 0 \rangle = \frac{\delta}{\delta J_1} \frac{\delta}{\delta J_2} \frac{\delta}{\delta J_3} \left[ -J_x D_{x4} \right] e^{-\frac{1}{2} J_x D_{xy} J_y} \Big|_{J=0}$$

$$= \frac{\delta}{\delta J_1} \frac{\delta}{\delta J_2} \left[ -D_{34} + J_x D_{x4} J_y D_{y3} \right] e^{-\frac{1}{2} J_x D_{xy} J_y} \Big|_{J=0}$$

$$= \frac{\delta}{\delta J_1} \left[ D_{34} J_x D_{x2} + D_{24} J_y D_{y3} + J_x D_{x4} D_{23} \right] e^{-\frac{1}{2} J_x D_{xy} J_y} \Big|_{J=0}$$

$$= D_{34} D_{12} + D_{24} D_{13} + D_{14} D_{23}, \tag{9.41}$$

in agreement with (9.29). The rules for differentiating the exponential give
rise to the same familiar pattern: We get one term for each possible way of
contracting the four points in pairs, with a factor of $D_F$ for each contraction.
The generating functional method used just above to construct the
correlations of a free field theory can be used as well to represent the
correlation functions of an interacting field theory. Formula (9.35) is
independent of whether the theory is free or interacting. The factor
$Z[J = 0]$ is nontrivial in the case of an interacting field theory, but it
simply gives the denominator of Eq. (9.18), that is, the sum of vacuum
diagrams. Again from this approach, the combinatoric issues in the evaluation
of correlation functions are the same as in Section 4.4.


9.3 The Analogy Between Quantum Field Theory 
and Statistical Mechanics 

Let us now pause from the technical aspects of this discussion to consider some
implications of the formulae we have derived. To begin, let us summarize the
formal conclusions of the previous section in the following way: For a field
theory governed by the Lagrangian $\mathcal{L}$, the generating functional of correlation
functions is

$$Z[J] = \int \mathcal{D}\phi\, \exp\left[ i \int d^4x\, (\mathcal{L} + J\phi) \right]. \tag{9.42}$$

The time variable of integration in the exponent runs from $-T$ to $T$, with
$T \to \infty(1 - i\epsilon)$. A correlation function such as (9.18) is reproduced by
writing

$$\langle 0 | T \phi(x_1) \phi(x_2) | 0 \rangle = Z[J]^{-1} \left( -i \frac{\delta}{\delta J(x_1)} \right) \left( -i \frac{\delta}{\delta J(x_2)} \right) Z[J] \Big|_{J=0}. \tag{9.43}$$


The generating functional (9.42) is reminiscent of the partition function of 
statistical mechanics. It has the same general structure of an integral over all 
possible configurations of an exponential statistical weight. The source J(x) 
plays the role of an external field. In fact, our method of computing correlation 
functions by differentiating with respect to J(x) mimics the trick often used 
in statistical mechanics of computing correlation functions by differentiating 
with respect to such variables as the pressure or the magnetic field. 

This analogy can be made more precise by manipulating the time variable
of integration in (9.42). The derivation of the functional integral formula
implied that the time integration was slightly tipped into the complex plane,
in just the direction to permit the contour to be rotated clockwise onto the
imaginary axis. We have already noted (below (9.23)) that the original
infinitesimal rotation gives the correct $i\epsilon$ prescription to produce the
Feynman propagator. The finite rotation is the analogue in configuration space
of the Wick rotation of the time component of momentum illustrated in
Fig. 6.1. Like the Wick rotation in a momentum integral, this Wick rotation of
the time coordinate $t \to -ix^0$ produces a Euclidean 4-vector product:

$$x^2 = t^2 - |\mathbf{x}|^2 \to -(x^0)^2 - |\mathbf{x}|^2 = -|x_E|^2. \tag{9.44}$$

It is possible to show, by manipulating the expression for each Feynman
diagram, that the analytic continuation of the time variables in any Green's
function of a quantum field theory produces a correlation function invariant
under the rotational symmetry of four-dimensional Euclidean space. This
Wick rotation inside the functional integral demonstrates this same conclusion
in a more general way.

To understand what we have achieved by this rotation, consider the example
of $\phi^4$ theory. The action of $\phi^4$ theory coupled to sources is

$$\int d^4x\, (\mathcal{L} + J\phi) = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2} m^2 \phi^2 - \frac{\lambda}{4!} \phi^4 + J\phi \right]. \tag{9.45}$$

After the Wick rotation (9.44), this expression takes the form

$$i \int d^4x_E\, (\mathcal{L}_E - J\phi) = i \int d^4x_E \left[ \frac{1}{2}(\partial_{E\mu}\phi)^2 + \frac{1}{2} m^2 \phi^2 + \frac{\lambda}{4!} \phi^4 - J\phi \right]. \tag{9.46}$$

This expression is identical in form to the expression (8.8) for the Gibbs free
energy of a ferromagnet in the Landau theory. The field $\phi(x_E)$ plays the role
of the fluctuating spin field $s(\mathbf{x})$, and the source $J(x)$ plays the role of an
external magnetic field. Note that the new ferromagnet lives in four, rather
than three, spatial dimensions.

The Wick-rotated generating functional $Z[J]$ becomes

$$Z[J] = \int \mathcal{D}\phi\, \exp\left[ -\int d^4x_E\, (\mathcal{L}_E - J\phi) \right]. \tag{9.47}$$


The functional C E [<p\ has the form of an energy: It is bounded from below 
and becomes large when the field (p has large amplitude or large gradients. 
The exponential, then, is a reasonable statistical weight for the fluctuations 
of <p. In this new form, Z[J] is precisely the partition function describing the 
statistical mechanics of a macroscopic system, described approximately by 
treating the fluctuating variable as a continuum field. 

The Green’s functions of $\phi(x_E)$ after Wick rotation can be calculated
from the functional integral (9.47) exactly as we computed Minkowski Green’s
functions in the previous section. For the free theory ($\lambda = 0$), a set of
manipulations analogous to those that produced (9.27) or (9.40) gives the
correlation function of $\phi$ as

\[ \langle \phi(x_{E1})\,\phi(x_{E2}) \rangle = \int \frac{d^4k_E}{(2\pi)^4}\, \frac{e^{ik_E\cdot(x_{E1}-x_{E2})}}{k_E^2 + m^2}. \tag{9.48} \]

This is just the Feynman propagator evaluated in the spacelike region; according
to Eq. (2.52), this function falls off as $\exp(-m|x_{E1} - x_{E2}|)$. That behavior
is the four-dimensional analogue of the spin correlation function (8.15). We
see that, in the Euclidean continuation of field theory Green’s functions, the
Compton wavelength $m^{-1}$ of the quanta becomes the correlation length of
statistical fluctuations.
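The link between mass and correlation length can be checked numerically in a toy setting. The sketch below is not the four-dimensional integral (9.48); it assumes a free scalar on a one-dimensional periodic lattice with unit spacing, where the Euclidean action is a quadratic form and the two-point function is the inverse of the lattice operator $-\Delta + m^2$. The function name `lattice_propagator` and all parameter values are illustrative choices.

```python
import numpy as np

def lattice_propagator(n_sites, mass):
    """Two-point function of a free scalar on a periodic 1D lattice (spacing 1).

    The Euclidean action is (1/2) phi^T K phi with K = -Laplacian + m^2,
    so <phi_0 phi_r> is the first row of K^{-1}.
    """
    K = (2.0 + mass**2) * np.eye(n_sites)
    for i in range(n_sites):
        K[i, (i + 1) % n_sites] -= 1.0
        K[i, (i - 1) % n_sites] -= 1.0
    return np.linalg.inv(K)[0]

mass = 0.5
G = lattice_propagator(200, mass)

# Away from r = 0 the lattice equation gives G(r) ~ exp(-kappa r),
# with cosh(kappa) = 1 + m^2/2, so kappa -> m for small m.
kappa = np.arccosh(1.0 + mass**2 / 2.0)
decay = G[11] / G[10]
print("measured decay per site:", decay)
print("exp(-kappa):            ", np.exp(-kappa))
```

The decay rate per lattice site agrees with $e^{-\kappa}$, and $\kappa$ approaches $m$ as the mass becomes small in lattice units, which is the lattice version of the statement that $m^{-1}$ sets the correlation length.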

This correspondence between quantum field theory and statistical mechanics
will play an important role in the developments of the next few chapters.
In essence, it adds to our reserves of knowledge a completely new source
of intuition about how field theory expectation values should behave. This
intuition will be useful in imagining the general properties of loop diagrams
and, as we have already discussed in Chapter 8, it will give important insights
that will help us correctly understand the role of ultraviolet divergences in
field theory calculations. In Chapter 13, we will see that field theory can also
contribute to statistical mechanics by making profound predictions about the
behavior of thermal systems from the properties of Feynman diagrams.


9.4 Quantization of the Electromagnetic Field 


In Section 4.8 we stated without proof the Feynman rule for the photon
propagator,

\[ \frac{-i g_{\mu\nu}}{k^2 + i\epsilon}. \tag{9.49} \]

Now that we have the functional integral quantization method at our command,
let us apply it to the derivation of this expression.

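Consider the functional integral

\[ \int \mathcal{D}A\, e^{iS[A]}, \tag{9.50} \]

where $S[A]$ is the action for the free electromagnetic field. (The functional
integral is over each of the four components:
$\mathcal{D}A \equiv \mathcal{D}A^0\,\mathcal{D}A^1\,\mathcal{D}A^2\,\mathcal{D}A^3$.)
Integrating by parts and expanding the field as a Fourier integral, we can write
the action as

\[ \begin{aligned}
S &= \int d^4x \left[ -\tfrac14 (F_{\mu\nu})^2 \right] \\
  &= \tfrac12 \int d^4x\, A_\mu(x) \left( \partial^2 g^{\mu\nu} - \partial^\mu\partial^\nu \right) A_\nu(x) \\
  &= \tfrac12 \int \frac{d^4k}{(2\pi)^4}\, \tilde A_\mu(k) \left( -k^2 g^{\mu\nu} + k^\mu k^\nu \right) \tilde A_\nu(-k).
\end{aligned} \tag{9.51} \]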

This expression vanishes when $\tilde A_\mu(k) = k_\mu\,\alpha(k)$, for any scalar function $\alpha(k)$.
For this large set of field configurations the integrand of (9.50) is 1, and
therefore the functional integral is badly divergent (there is no Gaussian
damping). Equivalently, the equation

\[ \left( \partial^2 g_{\mu\nu} - \partial_\mu\partial_\nu \right) D_F^{\nu\rho}(x-y) = i\,\delta_\mu{}^\rho\, \delta^{(4)}(x-y) \]

or

\[ \left( -k^2 g_{\mu\nu} + k_\mu k_\nu \right) \tilde D_F^{\nu\rho}(k) = i\,\delta_\mu{}^\rho, \tag{9.52} \]

which would define the Feynman propagator $D_F^{\nu\rho}$, has no solution, since the
$4\times4$ matrix $(-k^2 g_{\mu\nu} + k_\mu k_\nu)$ is singular.
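The singularity is easy to confirm numerically. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and an arbitrarily chosen off-shell momentum; the variable names are illustrative. It checks that $k^\nu$ is annihilated by the kinetic matrix, so the determinant vanishes.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])   # g_{mu nu}, signature (+,-,-,-)
k_up = np.array([1.3, 0.4, -0.7, 0.2])      # arbitrary off-shell momentum k^mu
k_lo = metric @ k_up                        # k_mu
k2 = k_up @ k_lo                            # k^2 (equals 1.0 for this choice)

# Kinetic matrix of the free photon action, both indices lowered
kinetic = -k2 * metric + np.outer(k_lo, k_lo)

# (-k^2 g_{mu nu} + k_mu k_nu) k^nu = -k^2 k_mu + k_mu k^2 = 0
null_image = kinetic @ k_up
det_kinetic = np.linalg.det(kinetic)
print("image of k:", null_image)
print("determinant:", det_kinetic)
```

The null vector $k^\nu$ is exactly the pure-gauge direction $\tilde A_\mu \propto k_\mu$ on which the action (9.51) vanishes.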

This difficulty is due to gauge invariance. Recall that $F_{\mu\nu}$, and hence $\mathcal{L}$,
is invariant under a general gauge transformation of the form

\[ A_\mu(x) \to A_\mu(x) + \frac{1}{e}\,\partial_\mu\alpha(x). \]

The troublesome modes are those for which $A_\mu(x) = \frac{1}{e}\partial_\mu\alpha(x)$, that is, those
that are gauge-equivalent to $A_\mu(x) = 0$. The functional integral is badly
defined because we are redundantly integrating over a continuous infinity of
physically equivalent field configurations. To fix the problem, we would like
to isolate the interesting part of the functional integral, which counts each
physical configuration only once.

We can accomplish this by means of a trick, due to Faddeev and Popov.†
Let $G(A)$ be some function that we wish to set equal to zero as a gauge-fixing
condition; for example, $G(A) = \partial_\mu A^\mu$ corresponds to Lorentz gauge.
We could constrain the functional integral to cover only the configurations
with $G(A) = 0$ by inserting a functional delta function, $\delta(G(A))$. (Think of
this object as an infinite product of delta functions, one for each point $x$.) To
do so legally, we insert 1 under the integral of (9.50), in the following form:

\[ 1 = \int \mathcal{D}\alpha(x)\; \delta\big(G(A^\alpha)\big)\, \det\left( \frac{\delta G(A^\alpha)}{\delta\alpha} \right), \tag{9.53} \]

where $A^\alpha$ denotes the gauge-transformed field,

\[ A^\alpha_\mu(x) = A_\mu(x) + \frac{1}{e}\,\partial_\mu\alpha(x). \]

Equation (9.53) is the continuum generalization of the identity

\[ 1 = \Big( \prod_i \int da_i \Big)\, \delta^{(n)}\big(\mathbf{g}(\mathbf{a})\big)\, \det\left( \frac{\partial g_i}{\partial a_j} \right) \]

for discrete $n$-dimensional vectors. In Lorentz gauge we have
$G(A^\alpha) = \partial^\mu A_\mu + (1/e)\,\partial^2\alpha$, so the functional determinant
$\det(\delta G(A^\alpha)/\delta\alpha)$ is equal to $\det(\partial^2/e)$. For the present discussion,
the only relevant property of this determinant is that it is independent of $A$,
so we can treat it as a constant in the functional integral.
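The discrete identity can be illustrated in the simplest case $n = 1$. The sketch below assumes a hypothetical monotonic function $g(a) = a^3 + a - 1$ and replaces the delta function by a narrow normalized Gaussian; neither choice comes from the text. With the Jacobian factor the integral is 1; without it the integral is $1/g'(a_0)$ at the root $a_0$, which is why the determinant must accompany the delta function.

```python
import numpy as np

def g(a):
    return a**3 + a - 1.0          # hypothetical constraint function, monotonic in a

def g_prime(a):
    return 3.0 * a**2 + 1.0        # its Jacobian dg/da (positive everywhere)

def smeared_delta(x, eps=1e-2):
    """Narrow normalized Gaussian standing in for the Dirac delta function."""
    return np.exp(-x**2 / (2.0 * eps**2)) / (np.sqrt(2.0 * np.pi) * eps)

a = np.linspace(-2.0, 2.0, 400001)
da = a[1] - a[0]

# Riemann sums for  int da delta(g(a)) g'(a)  and  int da delta(g(a))
with_jacobian = np.sum(smeared_delta(g(a)) * g_prime(a)) * da
without_jacobian = np.sum(smeared_delta(g(a))) * da
print("with Jacobian:   ", with_jacobian)
print("without Jacobian:", without_jacobian)
```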

After inserting (9.53), the functional integral (9.50) becomes

\[ \det\left( \frac{\delta G(A^\alpha)}{\delta\alpha} \right) \int \mathcal{D}\alpha \int \mathcal{D}A\; e^{iS[A]}\, \delta\big(G(A^\alpha)\big). \]

Now change variables from $A$ to $A^\alpha$. This is a simple shift, so $\mathcal{D}A = \mathcal{D}A^\alpha$.
Also, by gauge invariance, $S[A] = S[A^\alpha]$. Since $A^\alpha$ is now just a dummy
integration variable, we can rename it back to $A$, obtaining

\[ \int \mathcal{D}A\; e^{iS[A]} = \det\left( \frac{\delta G(A^\alpha)}{\delta\alpha} \right) \int \mathcal{D}\alpha \int \mathcal{D}A\; e^{iS[A]}\, \delta\big(G(A)\big). \tag{9.54} \]

The functional integral over $A$ is now restricted by the delta function to
physically inequivalent field configurations, as desired. The divergent integral
over $\alpha(x)$ simply gives an infinite multiplicative factor.

†L. D. Faddeev and V. N. Popov, Phys. Lett. 25B, 29 (1967).

To go further we must specify a gauge-fixing function $G(A)$. We choose
the general class of functions

\[ G(A) = \partial^\mu A_\mu(x) - \omega(x), \tag{9.55} \]

where $\omega(x)$ can be any scalar function. Setting this $G(A)$ equal to zero gives
a generalization of the Lorentz gauge condition. The functional determinant
is the same as in Lorentz gauge, $\det(\delta G(A^\alpha)/\delta\alpha) = \det(\partial^2/e)$. Thus the
functional integral becomes

\[ \int \mathcal{D}A\; e^{iS[A]} = \det\left( \frac{1}{e}\partial^2 \right) \left( \int \mathcal{D}\alpha \right) \int \mathcal{D}A\; e^{iS[A]}\, \delta\big( \partial^\mu A_\mu - \omega(x) \big). \]

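This equality holds for any $\omega(x)$, so it will also hold if we replace the
right-hand side with any properly normalized linear combination involving
different functions $\omega(x)$. For our final trick, we will integrate over all
$\omega(x)$, with a Gaussian weighting function centered on $\omega = 0$. The above
expression is thus equal to

\[ \begin{aligned}
& N(\xi) \int \mathcal{D}\omega\, \exp\left[ -i\int d^4x\, \frac{\omega^2}{2\xi} \right] \det\left( \frac{1}{e}\partial^2 \right) \left( \int \mathcal{D}\alpha \right) \int \mathcal{D}A\; e^{iS[A]}\, \delta\big( \partial^\mu A_\mu - \omega(x) \big) \\
&\quad = N(\xi)\, \det\left( \frac{1}{e}\partial^2 \right) \left( \int \mathcal{D}\alpha \right) \int \mathcal{D}A\; e^{iS[A]} \exp\left[ -i\int d^4x\, \frac{1}{2\xi} (\partial^\mu A_\mu)^2 \right],
\end{aligned} \tag{9.56} \]

where $N(\xi)$ is an unimportant normalization constant and we have used the
delta function to perform the integral over $\omega$. We can choose $\xi$ to be any
finite constant. Effectively, we have added a new term $-(\partial^\mu A_\mu)^2/2\xi$ to the
Lagrangian.

So far we have worked only with the denominator of our formula for
correlation functions,

\[ \langle\Omega|\, T\mathcal{O}(A)\, |\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}A\; \mathcal{O}(A)\, \exp\left[ i\int_{-T}^{T} d^4x\, \mathcal{L} \right]}{\int \mathcal{D}A\; \exp\left[ i\int_{-T}^{T} d^4x\, \mathcal{L} \right]}. \]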

The same manipulations can also be performed on the numerator, provided
that the operator $\mathcal{O}(A)$ is gauge invariant. (If it is not, the variable change
from $A$ to $A^\alpha$ preceding Eq. (9.54) does not work.) Assuming that $\mathcal{O}(A)$ is
gauge invariant, we find for its correlation function

\[ \langle\Omega|\, T\mathcal{O}(A)\, |\Omega\rangle = \lim_{T\to\infty(1-i\epsilon)} \frac{\int \mathcal{D}A\; \mathcal{O}(A)\, \exp\left[ i\int_{-T}^{T} d^4x \left[ \mathcal{L} - \frac{1}{2\xi}(\partial^\mu A_\mu)^2 \right] \right]}{\int \mathcal{D}A\; \exp\left[ i\int_{-T}^{T} d^4x \left[ \mathcal{L} - \frac{1}{2\xi}(\partial^\mu A_\mu)^2 \right] \right]}. \tag{9.57} \]

The awkward constant factors in (9.56) have canceled; the only trace left by
this whole process is the extra $\xi$-term that is added to the action.

At the beginning of this section, in Eq. (9.52), we saw that we could
not obtain a sensible photon propagator from the action $S[A]$. With the new
$\xi$-term, however, that equation becomes

\[ \left( -k^2 g_{\mu\nu} + \left(1 - \tfrac{1}{\xi}\right) k_\mu k_\nu \right) \tilde D_F^{\nu\rho}(k) = i\,\delta_\mu{}^\rho, \]

which has the solution

\[ \tilde D_F^{\mu\nu}(k) = \frac{-i}{k^2 + i\epsilon} \left( g^{\mu\nu} - (1-\xi)\, \frac{k^\mu k^\nu}{k^2} \right). \tag{9.58} \]

This is our desired expression for the photon propagator. The $i\epsilon$ term in the
denominator arises exactly as in the Klein-Gordon case. Note the overall minus
sign relative to the Klein-Gordon propagator, which was already evident in
Eq. (9.52).
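As a quick numerical cross-check of (9.58), one can multiply the gauge-fixed kinetic matrix by the claimed propagator and compare with $i\,\delta_\mu{}^\rho$. This is a sketch under stated assumptions: metric signature $(+,-,-,-)$, an arbitrary off-shell momentum with $k^2 \neq 0$ so that the $i\epsilon$ can be dropped, and illustrative function names.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])   # g_{mu nu} and g^{mu nu} (same entries)
k_up = np.array([1.3, 0.4, -0.7, 0.2])      # arbitrary off-shell momentum k^mu
k_lo = metric @ k_up                        # k_mu
k2 = k_up @ k_lo                            # k^2, nonzero for this choice

def gauge_fixed_kinetic(xi):
    """-k^2 g_{mu nu} + (1 - 1/xi) k_mu k_nu, indices down."""
    return -k2 * metric + (1.0 - 1.0 / xi) * np.outer(k_lo, k_lo)

def photon_propagator(xi):
    """Right-hand side of (9.58) with the i*epsilon dropped, indices up."""
    return (-1j / k2) * (metric - (1.0 - xi) * np.outer(k_up, k_up) / k2)

def residual(xi):
    """Kinetic matrix times propagator minus i times the identity."""
    return gauge_fixed_kinetic(xi) @ photon_propagator(xi) - 1j * np.eye(4)

for xi in (1.0, 3.0, 0.5):
    print(xi, np.max(np.abs(residual(xi))))
```

The residual vanishes to rounding error for each finite nonzero $\xi$ tried. The value $\xi = 0$ cannot be fed to `gauge_fixed_kinetic` because of the $1/\xi$, although the propagator itself has a smooth limit there.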

In practice one usually chooses a specific value of $\xi$ when making
computations. Two choices that are often convenient are

    $\xi = 0$    Landau gauge;

    $\xi = 1$    Feynman gauge.

So far in this book we have always used Feynman gauge.*

The Faddeev-Popov procedure guarantees that the value of any correlation
function of gauge-invariant operators computed from Feynman diagrams
will be independent of the value of $\xi$ used in the calculation (as long as the
same value of $\xi$ is used consistently). In the case of QED, it is not difficult
to prove this $\xi$-independence directly. Notice in Eq. (9.58) that $\xi$ multiplies a
term in the photon propagator proportional to $k^\mu k^\nu$. According to the
Ward-Takahashi identity (7.68), the replacement in a Green’s function of any photon
propagator by $k^\mu k^\nu$ yields zero, except for terms involving external off-shell
fermions. These terms are equal and opposite for particle and antiparticle and
vanish when the fermions are grouped into gauge-invariant combinations.
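The simplest instance of this statement can be checked by hand or by machine: at tree level, with both fermions on shell, the current $\bar u(p')\gamma^\mu u(p)$ is conserved, so contracting it with $k_\mu = (p'-p)_\mu$ gives zero and the $k^\mu k^\nu$ part of the propagator drops out. The sketch below assumes the Dirac representation of the gamma matrices and builds on-shell spinors by applying $(\not\! p + m)$ to arbitrary four-component vectors; the momenta, the vectors, and the helper names are illustrative.

```python
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
pauli = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
# Dirac-representation gamma^mu, metric signature (+,-,-,-)
gamma = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(p_up):
    """p-slash = gamma^mu p_mu for a contravariant momentum p^mu."""
    p_lo = metric @ p_up
    return sum(p_lo[mu] * gamma[mu] for mu in range(4))

def on_shell(p3, m):
    """Contravariant on-shell 4-momentum built from a 3-momentum."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(m**2 + p3 @ p3)], p3))

def spinor(p_up, m, chi):
    """A solution of (p-slash - m) u = 0, obtained by projecting chi."""
    return (slash(p_up) + m * np.eye(4)) @ chi

m = 1.0
p1, p2 = on_shell([0.3, -0.2, 0.5], m), on_shell([-0.4, 0.1, 0.9], m)
u1 = spinor(p1, m, np.array([1.0, 0.5j, -0.3, 0.2]))
u2 = spinor(p2, m, np.array([0.2, 1.0, 0.7j, -0.1]))
u2_bar = u2.conj() @ gamma[0]

current = np.array([u2_bar @ gamma[mu] @ u1 for mu in range(4)])  # j^mu
k_lo = metric @ (p2 - p1)                                          # k_mu
k_dot_j = k_lo @ current
print("k.j =", k_dot_j)
```

This only covers the on-shell, tree-level case; the general statement for off-shell Green’s functions is the Ward-Takahashi identity cited above.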

*Other choices of $\xi$ may be useful in specific applications; for example, in certain
problems of bound states in QED, the Yennie gauge, $\xi = 3$, produces a cancellation
that is otherwise difficult to make explicit. See H. M. Fried and D. R. Yennie, Phys.
Rev. 112, 1391 (1958).

To complete our treatment of the quantization of the electromagnetic field,
we need one additional ingredient. In Chapters 5 and 6, we computed $S$-matrix
elements for QED from the correlation functions of non-gauge-invariant
operators $\psi(x)$, $\bar\psi(x)$, and $A_\mu(x)$. We will now argue that the $S$-matrix elements
are given correctly by this procedure. Since the $S$-matrix is defined between
asymptotic states, we can compute $S$-matrix elements in a formalism in which
the coupling constant is turned off adiabatically in the far past and far future.
In the zero coupling limit, there is a clean separation between gauge-invariant
and gauge-variant states. Single-particle states containing one electron,
one positron, or one transversely polarized photon are gauge-invariant,
while states with timelike and longitudinal photon polarizations transform
under gauge motions. We can thus define a gauge-invariant $S$-matrix in the
following way: Let $S_{FP}$ be the $S$-matrix between general asymptotic states,
computed from the Faddeev-Popov procedure. This matrix is unitary but
not gauge-invariant. Let $P_0$ be a projection onto the subspace of the space
of asymptotic states in which all particles are either electrons, positrons, or
transverse photons. Then let

\[ S = P_0\, S_{FP}\, P_0. \tag{9.59} \]

This $S$-matrix is gauge invariant by construction, because it is projected onto
gauge-invariant states. It is not obvious that it is unitary. However, we
addressed this issue in Section 5.5. We showed there that any matrix element
$\mathcal{M}^\mu\epsilon^*_\mu$ for photon emission satisfies

\[ \sum_{i=1,2} \epsilon_{i\mu}\,\epsilon^*_{i\nu}\; \mathcal{M}^\mu \mathcal{M}^{*\nu} = (-g_{\mu\nu})\, \mathcal{M}^\mu \mathcal{M}^{*\nu}, \tag{9.60} \]

where the sum on the left-hand side runs only over transverse polarizations.
The same argument applies if $\mathcal{M}^\mu$ and $\mathcal{M}^{*\nu}$ are distinct amplitudes, as long
as they satisfy the Ward identity. This is exactly the information we need to
see that

\[ S S^\dagger = P_0\, S_{FP}\, P_0\, S_{FP}^\dagger\, P_0 = P_0\, S_{FP}\, S_{FP}^\dagger\, P_0. \tag{9.61} \]

Now we can use the unitarity of $S_{FP}$ to see that $S$ is unitary, $SS^\dagger = 1$, on
the subspace of gauge-invariant states. It is easy to check explicitly that the
formula (9.59) for the $S$-matrix is independent of $\xi$: The Ward identity implies
that any QED matrix element with all external fermions on-shell is unchanged
if we add to the photon propagator $D^{\mu\nu}(q)$ any term proportional to $q^\mu$.

9.5 Functional Quantization of Spinor Fields 

The functional methods that we have used so far allow us to compute, using
Eq. (9.18) or (9.35), correlation functions involving fields that obey canonical
commutation relations. To generalize these methods to include spinor fields,
which obey canonical anticommutation relations, we must do something
different: We must represent even the classical fields by anticommuting numbers.
Anticommuting Numbers

We will define anticommuting numbers (also called Grassmann numbers) by
giving algebraic rules for manipulating them. These rules are formal and might
seem ad hoc. We will justify them by showing that they lead to the familiar
quantum theory of the Dirac equation.

The basic feature of anticommuting numbers is that they anticommute.
For any two such numbers $\theta$ and $\eta$,

\[ \theta\eta = -\eta\theta. \tag{9.62} \]

In particular, the square of any Grassmann number is zero:

\[ \theta^2 = 0. \]

(This fact makes algebra extremely easy.) A product $(\theta\eta)$ of two Grassmann
numbers commutes with other Grassmann numbers. We will also wish to
add Grassmann numbers, and to multiply them by ordinary numbers; these
operations have all the properties of addition and scalar multiplication in any
vector space.

The main thing we want to do with anticommuting numbers is integrate
over them. To define functional integration, we do not need general definite
integrals of these parameters, but only the analog of $\int_{-\infty}^{\infty} dx$. So let us
define the integral of a general function $f$ of a Grassmann variable $\theta$, over the
complete range of $\theta$:

\[ \int d\theta\, f(\theta) = \int d\theta\, (A + B\theta). \]

In general, $f(\theta)$ can be expanded in a Taylor series, which terminates after
two terms since $\theta^2 = 0$. The integral should be linear in $f$; thus it must be
a linear function of $A$ and $B$. Its value is fixed by one additional property:
In our analysis of bosonic functional integrals (for instance, in (9.38) and
(9.54)), we made strong use of the invariance of the integral to shifts of the
integration variable. We will see in Section 9.6 that this shift invariance of
the functional integral plays a central role in the derivation of the quantum
mechanical equations of motion and conservation laws, and thus must be
considered a fundamental aspect of the formalism. We must, then, demand
this same property for integrals over $\theta$. Invariance under the shift
$\theta \to \theta + \eta$ yields the condition

\[ \int d\theta\, (A + B\theta) = \int d\theta\, \big( (A + B\eta) + B\theta \big). \]

The shift changes the constant term, but leaves the linear term unchanged.
The only linear function of $A$ and $B$ that has this property is a constant
(conventionally taken to be 1) times $B$, so we define*

\[ \int d\theta\, (A + B\theta) = B. \tag{9.63} \]

When we perform a multiple integral over more than one Grassmann variable,
an ambiguity in sign arises; we adopt the convention

\[ \int d\theta \int d\eta\; \eta\theta = +1, \tag{9.64} \]

performing the innermost integral first.
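As a one-line illustration of why a convention is needed (a worked step added here, using only (9.62) to (9.64)), exchanging the two factors in the integrand flips the sign of the result:

```latex
\int d\theta \int d\eta \;\, \theta\eta
  \;=\; -\int d\theta \int d\eta \;\, \eta\theta
  \;=\; -1 .
```

Equivalently, each integral removes the variable standing immediately to its right, so a variable must first be anticommuted into that position before it is integrated.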

Since the Dirac field is complex-valued, we will work primarily with complex
Grassmann numbers, which can be built out of real and imaginary parts
in the usual way. It is convenient to define complex conjugation to reverse the
order of products, just like Hermitian conjugation of operators:

\[ (\theta\eta)^* \equiv \eta^*\theta^* = -\theta^*\eta^*. \tag{9.65} \]

To integrate over complex Grassmann numbers, let us define

\[ \theta = \frac{\theta_1 + i\theta_2}{\sqrt{2}}, \qquad \theta^* = \frac{\theta_1 - i\theta_2}{\sqrt{2}}. \]

We can now treat $\theta$ and $\theta^*$ as independent Grassmann numbers, and adopt
the convention $\int d\theta^*\, d\theta\, (\theta\theta^*) = 1$.

Let us evaluate a Gaussian integral over a complex Grassmann variable:

\[ \int d\theta^*\, d\theta\; e^{-\theta^* b\,\theta} = \int d\theta^*\, d\theta\, \big( 1 - \theta^* b\,\theta \big) = \int d\theta^*\, d\theta\, \big( 1 + \theta\theta^* b \big) = b. \tag{9.66} \]

If $\theta$ were an ordinary complex number, this integral would equal $2\pi/b$. The
factor of $2\pi$ is unimportant; the main difference with anticommuting numbers
is that the $b$ comes out in the numerator rather than the denominator.
However, if there is an additional factor of $\theta\theta^*$ in the integrand, we obtain

\[ \int d\theta^*\, d\theta\; \theta\theta^*\, e^{-\theta^* b\,\theta} = 1 = \frac{1}{b}\cdot b. \tag{9.67} \]

The extra $\theta\theta^*$ introduces a factor of $(1/b)$, just as it does in an ordinary
Gaussian integral.

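*This definition is due to F. A. Berezin, The Method of Second Quantization,
Academic Press, New York, 1966.

To perform general Gaussian integrals in higher dimensions, we must first
prove that an integral over complex Grassmann variables is invariant under
unitary transformations. Consider a set of $n$ complex Grassmann variables $\theta_i$,
and a unitary matrix $U$. If $\theta_i' = U_{ij}\theta_j$, then

\[ \begin{aligned}
\prod_i \theta_i' &= \frac{1}{n!}\, \epsilon^{ij\cdots l}\; \theta_i'\,\theta_j' \cdots \theta_l' \\
 &= \frac{1}{n!}\, \epsilon^{ij\cdots l}\; U_{ii'}\theta_{i'}\, U_{jj'}\theta_{j'} \cdots U_{ll'}\theta_{l'} \\
 &= \frac{1}{n!}\, \epsilon^{ij\cdots l}\; U_{ii'} U_{jj'} \cdots U_{ll'}\; \epsilon^{i'j'\cdots l'} \Big( \prod_i \theta_i \Big) \\
 &= (\det U) \Big( \prod_i \theta_i \Big).
\end{aligned} \tag{9.68} \]

In a general integral

\[ \Big( \prod_i \int d\theta_i^*\, d\theta_i \Big) f(\theta), \]

the only term of $f(\theta)$ that survives has exactly one factor of each $\theta_i$ and $\theta_i^*$;
it is proportional to $(\prod_i \theta_i)(\prod_i \theta_i^*)$. If we replace $\theta$ by $U\theta$, this term acquires a
factor of $(\det U)(\det U)^* = 1$, so the integral is unchanged under the unitary
transformation.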

We can now evaluate a general Gaussian integral involving a Hermitian
matrix $B$ with eigenvalues $b_i$:

\[ \Big( \prod_i \int d\theta_i^*\, d\theta_i \Big) e^{-\theta_i^* B_{ij} \theta_j} = \Big( \prod_i \int d\theta_i^*\, d\theta_i \Big) e^{-\sum_i \theta_i^* b_i \theta_i} = \prod_i b_i = \det B. \tag{9.69} \]

(If $\theta$ were an ordinary number, we would have obtained $(2\pi)^n/(\det B)$.)
Similarly, you can show that

\[ \Big( \prod_i \int d\theta_i^*\, d\theta_i \Big)\, \theta_k \theta_l^*\; e^{-\theta_i^* B_{ij} \theta_j} = (\det B)\, (B^{-1})_{kl}. \tag{9.70} \]

Inserting another pair $\theta_m \theta_n^*$ in the integrand would yield a second factor
$(B^{-1})_{mn}$, and a second term in which the indices $l$ and $n$ are interchanged
(the sum of all possible pairings). In general, except for the determinant being
in the numerator rather than the denominator, Gaussian integrals over
Grassmann variables behave exactly like Gaussian integrals over ordinary variables.
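These rules are simple enough to implement directly, which gives an independent check of (9.66), (9.69), and (9.70). The sketch below is one possible encoding, not a standard library: a Grassmann element is stored as a dictionary from sorted tuples of generator indices to coefficients, and the integral over a generator first anticommutes it to the far left and then strips it, which reproduces the conventions (9.63) and (9.64). The class name, the index layout (even index for $\theta_i$, odd for $\theta_i^*$), and the test matrix are all assumptions made for illustration.

```python
import itertools
import numpy as np

class Grassmann:
    """Grassmann-algebra element: {sorted tuple of generator indices: coefficient}."""
    def __init__(self, terms=None):
        self.terms = {m: c for m, c in (terms or {}).items() if c != 0}

    @staticmethod
    def gen(i):
        return Grassmann({(i,): 1.0})

    @staticmethod
    def const(c):
        return Grassmann({(): c})

    def __add__(self, other):
        out = dict(self.terms)
        for m, c in other.terms.items():
            out[m] = out.get(m, 0) + c
        return Grassmann(out)

    def scale(self, c):
        return Grassmann({m: c * v for m, v in self.terms.items()})

    def __mul__(self, other):
        out = {}
        for m1, c1 in self.terms.items():
            for m2, c2 in other.terms.items():
                if set(m1) & set(m2):
                    continue  # repeated generator: theta^2 = 0
                merged = m1 + m2
                # sign of the permutation that sorts the merged monomial
                inv = sum(1 for a, b in itertools.combinations(merged, 2) if a > b)
                key = tuple(sorted(merged))
                out[key] = out.get(key, 0) + (-1) ** inv * c1 * c2
        return Grassmann(out)

    def integrate(self, i):
        """Berezin integral over generator i: move it to the left, then strip it."""
        out = {}
        for m, c in self.terms.items():
            if i in m:
                p = m.index(i)
                key = m[:p] + m[p + 1:]
                out[key] = out.get(key, 0) + (-1) ** p * c
        return Grassmann(out)

    def scalar(self):
        return self.terms.get((), 0)

def gexp(x, order):
    """exp(x) for an even nilpotent element, truncated at the given order."""
    result = term = Grassmann.const(1.0)
    for k in range(1, order + 1):
        term = (term * x).scale(1.0 / k)
        result = result + term
    return result

def gaussian_integral(B, insert=None):
    """prod_i int dtheta_i^* dtheta_i [theta_k theta_l^*] exp(-theta^* B theta)."""
    n = B.shape[0]
    th = [Grassmann.gen(2 * i) for i in range(n)]        # theta_i
    ths = [Grassmann.gen(2 * i + 1) for i in range(n)]   # theta_i^*
    X = Grassmann()
    for i in range(n):
        for j in range(n):
            X = X + (ths[i] * th[j]).scale(-B[i, j])
    f = gexp(X, n)
    if insert is not None:
        k, l = insert
        f = th[k] * ths[l] * f
    for i in reversed(range(n)):    # innermost first: dtheta_i, then dtheta_i^*
        f = f.integrate(2 * i).integrate(2 * i + 1)
    return f.scalar()

B = np.array([[2.0, 1 + 1j], [1 - 1j, 3.0]])   # a Hermitian test matrix
print(gaussian_integral(np.array([[2.5]])))    # single variable, cf. (9.66)
print(gaussian_integral(B), np.linalg.det(B))  # cf. (9.69)
print(gaussian_integral(B, insert=(0, 1)),
      np.linalg.det(B) * np.linalg.inv(B)[0, 1])  # cf. (9.70)
```

The dictionary representation keeps the sign bookkeeping in one place (the inversion count in `__mul__` and the position parity in `integrate`), so the checks exercise exactly the anticommutation rule and the integration convention and nothing else.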

The Dirac Propagator 

A Grassmann field is a function of spacetime whose values are anticommuting
numbers. More precisely, we can define a Grassmann field $\psi(x)$ in terms of
any set of orthonormal basis functions:

\[ \psi(x) = \sum_i \psi_i\, \phi_i(x). \tag{9.71} \]

The basis functions $\phi_i(x)$ are ordinary c-number functions, while the
coefficients $\psi_i$ are Grassmann numbers. To describe the Dirac field, we take the $\phi_i$
to be a basis of four-component spinors.

We now have all the machinery needed to evaluate functional integrals,
and hence correlation functions, involving fermions. For example, the Dirac
two-point function is given by

\[ \langle 0|\, T\psi(x_1)\bar\psi(x_2)\, |0\rangle = \frac{\int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \exp\left[ i\int d^4x\; \bar\psi\,(i\not\!\partial - m)\,\psi \right] \psi(x_1)\bar\psi(x_2)}{\int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \exp\left[ i\int d^4x\; \bar\psi\,(i\not\!\partial - m)\,\psi \right]}. \]

(We write $\mathcal{D}\bar\psi$ instead of $\mathcal{D}\psi^*$ for convenience; the two are unitarily
equivalent. We also leave the limits on the time integrals implicit; they are the same
as in Eq. (9.18), and will yield an $i\epsilon$ term in the propagator as usual.) The
denominator of this expression, according to (9.69), is $\det(i\not\!\partial - m)$. The
numerator, according to (9.70), is this same determinant times the inverse of
the operator $-i(i\not\!\partial - m)$. Evaluating this inverse in Fourier space, we find the
familiar result for the Feynman propagator,

\[ \langle 0|\, T\psi(x_1)\bar\psi(x_2)\, |0\rangle = S_F(x_1 - x_2) = \int \frac{d^4k}{(2\pi)^4}\, \frac{i\, e^{-ik\cdot(x_1 - x_2)}}{\not\! k - m + i\epsilon}. \tag{9.72} \]

Higher correlation functions of free Dirac fields can be evaluated in a similar
manner. The answer is always just the sum of all possible full contractions
of the operators, with a factor of $S_F$ for each contraction, as we found from
Wick’s theorem in Chapter 4.

Generating Functional for the Dirac Field 

As with the Klein-Gordon field, we can alternatively derive the Feynman rules
for the free Dirac theory by means of a generating functional. In analogy with
(9.34), we define the Dirac generating functional as

\[ Z[\bar\eta, \eta] = \int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \exp\left[ i\int d^4x \left[ \bar\psi\,(i\not\!\partial - m)\,\psi + \bar\eta\psi + \bar\psi\eta \right] \right], \tag{9.73} \]

where $\eta(x)$ is a Grassmann-valued source field. You can easily shift $\psi(x)$ to
complete the square, to derive the simpler expression

\[ Z[\bar\eta, \eta] = Z_0 \cdot \exp\left[ -\int d^4x\, d^4y\; \bar\eta(x)\, S_F(x - y)\, \eta(y) \right], \tag{9.74} \]

where, as before, $Z_0$ is the value of the generating functional with the external
sources set to zero.

To obtain correlation functions, we will differentiate $Z$ with respect to $\eta$
and $\bar\eta$. First, however, we must adopt a sign convention for derivatives with
respect to Grassmann numbers. If $\eta$ and $\theta$ are anticommuting numbers, let us
define

\[ \frac{d}{d\eta}\, \theta\eta = -\frac{d}{d\eta}\, \eta\theta = -\theta. \tag{9.75} \]

Then referring to the definition (9.73) of $Z$, we see that the two-point function,
for example, is given by

\[ \langle 0|\, T\psi(x_1)\bar\psi(x_2)\, |0\rangle = Z_0^{-1} \left( -i\frac{\delta}{\delta\bar\eta(x_1)} \right) \left( +i\frac{\delta}{\delta\eta(x_2)} \right) Z[\bar\eta, \eta] \Big|_{\bar\eta,\eta = 0}. \]

Plugging in formula (9.74) for $Z[\bar\eta, \eta]$ and carefully keeping track of the signs,
we find that this expression is equal to the Feynman propagator, $S_F(x_1 - x_2)$.
Higher correlation functions can be evaluated in a similar way.


QED 


As we saw in Section 9.2 for the case of scalar fields, the functional integral
method allows us to read the Feynman rules for vertices directly from
the Lagrangian for an interacting field theory. For the theory of Quantum
Electrodynamics, the full Lagrangian is

\[ \begin{aligned}
\mathcal{L}_{\rm QED} &= \bar\psi\,(i\not\!\! D - m)\,\psi - \tfrac14 (F_{\mu\nu})^2 \\
 &= \bar\psi\,(i\not\!\partial - m)\,\psi - \tfrac14 (F_{\mu\nu})^2 - e\bar\psi\gamma^\mu\psi A_\mu \\
 &= \mathcal{L}_0 - e\bar\psi\gamma^\mu\psi A_\mu,
\end{aligned} \]

where $D_\mu = \partial_\mu + ieA_\mu$ is the gauge-covariant derivative.

To evaluate correlation functions, we expand the exponential of the
interaction term:

\[ \exp\left[ i\int \mathcal{L} \right] = \exp\left[ i\int \mathcal{L}_0 \right] \left[ 1 - ie\int d^4x\; \bar\psi\gamma^\mu\psi A_\mu + \cdots \right]. \]

The two terms of the free Lagrangian yield the Dirac and electromagnetic
propagators derived in this section and the last:

\[ \text{(fermion line from $y$ to $x$)} = \int \frac{d^4p}{(2\pi)^4}\, \frac{i\, e^{-ip\cdot(x-y)}}{\not\! p - m + i\epsilon}; \]

\[ \text{(photon line between $x$ and $y$)} = \int \frac{d^4q}{(2\pi)^4}\, \frac{-i g_{\mu\nu}\, e^{-iq\cdot(x-y)}}{q^2 + i\epsilon} \quad \text{(Feynman gauge)}. \]

The interaction term gives the QED vertex,

\[ \text{(vertex)} = -ie\gamma^\mu \int d^4x. \]

As in Chapter 4, we can rearrange these rules, performing the integrations
over vertex positions to obtain momentum-conserving delta functions, and
using these delta functions to perform most of the propagator momentum
integrals.

The only remaining aspect of the QED Feynman rules is the placement of
various minus signs. These signs are also built into the functional integral; for
example, interchanging $\theta_k$ and $\theta_l^*$ in Eq. (9.70) would introduce a factor of $-1$.
We will see another example of a fermion minus sign in the computation that
follows.
Functional Determinants

Throughout this chapter we have encountered expressions that we wrote
formally as functional determinants. To end this section, let us investigate one
of these objects more closely. We will find that, at least in this case, we can
write the determinant explicitly as a sum of Feynman diagrams.

Consider the object

\[ \int \mathcal{D}\bar\psi\, \mathcal{D}\psi\; \exp\left[ i\int d^4x\; \bar\psi\,(i\not\!\! D - m)\,\psi \right], \tag{9.76} \]

where $D_\mu = \partial_\mu + ieA_\mu$ and $A_\mu(x)$ is a given external background field.
Formally, this expression is a functional determinant:

\[ \begin{aligned}
 &= \det(i\not\!\! D - m) = \det(i\not\!\partial - m - e\not\!\! A) \\
 &= \det(i\not\!\partial - m) \cdot \det\left( 1 - \frac{i}{i\not\!\partial - m}\,(-ie\not\!\! A) \right).
\end{aligned} \]

In the last form, the first term is an infinite constant. The second term contains
the dependence of the determinant on the external field $A$. We will now show
that this dependence is well defined and, in fact, is exactly equivalent to the
sum of vacuum diagrams.

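To demonstrate this, we need only apply standard identities from linear
algebra. First notice that, if a matrix $B$ has eigenvalues $b_i$, we can write its
determinant as

\[ \det B = \prod_i b_i = \exp\left[ \sum_i \log b_i \right] = \exp\left[ \mathrm{Tr}(\log B) \right], \tag{9.77} \]

where the logarithm of a matrix is defined by its power series. Applying this
identity to our determinant, and writing out the power series of the logarithm,
we obtain‡

\[ \det\left( 1 - \frac{i}{i\not\!\partial - m}\,(-ie\not\!\! A) \right) = \exp\left[ \sum_{n=1}^{\infty} -\frac{1}{n}\, \mathrm{Tr}\left[ \left( \frac{i}{i\not\!\partial - m}\,(-ie\not\!\! A) \right)^{n} \right] \right]. \tag{9.78} \]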
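For a finite matrix the same identity can be checked numerically. The sketch below assumes a small real $3\times3$ matrix `M` standing in for the operator $\frac{i}{i\not\partial - m}(-ie\not\! A)$, chosen with norm below 1 so that the logarithm series converges; the matrix entries and the function name are illustrative.

```python
import numpy as np

# Stand-in for the operator inside the determinant; norm < 1 so the series converges.
M = np.array([[0.10, 0.20, 0.00],
              [0.05, -0.10, 0.10],
              [0.00, 0.30, 0.20]])

def log_det_series(M, n_terms=60):
    """Partial sum of -sum_n Tr(M^n)/n, the series form of Tr log(1 - M)."""
    total, power = 0.0, np.eye(M.shape[0])
    for n in range(1, n_terms + 1):
        power = power @ M
        total -= np.trace(power) / n
    return total

series_value = np.exp(log_det_series(M))
direct_value = np.linalg.det(np.eye(3) - M)
print(series_value, direct_value)
```

Each term $-\frac{1}{n}\mathrm{Tr}(M^n)$ in the exponent has the structure of a closed loop with $n$ insertions, including the factor $-1/n$.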


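Alternatively, we can evaluate this determinant by returning to
expression (9.76) and using Feynman diagrams. Expanding the interaction term, we
obtain the vertex rule

\[ \text{(vertex with external field)} = -ie\gamma^\mu \int d^4x\; A_\mu(x). \]

‡We use Tr() to denote operator traces, and tr() to denote Dirac traces.

Our determinant is then equal to a sum of Feynman diagrams,

\[ \det\left( 1 - \frac{i}{i\not\!\partial - m}\,(-ie\not\!\! A) \right) = \text{[sum of all vacuum diagrams in the background field]} = \exp\big[ \text{sum of single fermion loops with } n = 1, 2, 3, \ldots \text{ insertions of } A \big]. \tag{9.79} \]

The series exponentiates, since the disconnected diagrams are products of
connected pieces (with appropriate symmetry factors when a piece is repeated).
For example, a disconnected diagram made of two copies of the same loop
equals $\frac{1}{2}$ times the square of that loop.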


Now let us evaluate the $n$th diagram in the exponent of (9.79). There is a factor
of $-1$ from the fermion loop, and a symmetry factor of $1/n$ since we could
rotate the interactions around the diagram up to $n$ times without changing
it. (The factor is not $1/n!$, because the cyclic order of the interaction points
is significant.) The diagram is therefore

\[ \begin{aligned}
 & -\frac{1}{n} \int dx_1 \cdots dx_n\; \mathrm{tr}\big[ (-ie\not\!\! A(x_1))\, S_F(x_2 - x_1) \cdots (-ie\not\!\! A(x_n))\, S_F(x_1 - x_n) \big] \\
 &\qquad = -\frac{1}{n}\, \mathrm{Tr}\left[ \left( \frac{i}{i\not\!\partial - m}\,(-ie\not\!\! A) \right)^{n} \right],
\end{aligned} \tag{9.80} \]

in exact agreement with (9.78), including the minus sign and the symmetry
factor.

The computation of functional determinants using Feynman diagrams is 
an important tool, as we will see in Chapter 11. 
9.6 Symmetries in the Functional Formalism

We have now seen that the quantum field theoretic correlation functions of
scalar, vector, and spinor fields can be computed from the functional integral,
completely bypassing the construction of the Hamiltonian, the Hilbert space of
states, and the equations of motion. The functional integral formalism makes
the symmetries of the problem manifest; any invariance of the Lagrangian
will be an invariance of the quantum dynamics.† However, we would like to
be able to appeal also to the conservation laws that follow from the quantum
equations of motion, or to these equations of motion themselves. For example,
the Ward identity, which played a major role in our discussion of photons in
QED (Section 5.5), is essentially the conservation law of the electric charge
current. Since, as we saw in Section 2.2, the conservation laws follow from
symmetries of the Lagrangian, one might guess that it is not difficult to derive
these conservation laws from the functional integral. In this section we will
see how to do that. We will see that the functional integral gives, in a most
direct way, a quantum generalization of Noether’s theorem. This result will
lead to the analogue of the Ward-Takahashi identity for any symmetry of a
general quantum field theory.


Equations of Motion 

To prepare for this discussion, we should determine how the quantum
equations of motion follow from the functional integral formalism. As a first
problem to study, let us examine the Green’s functions of the free scalar field. To
be specific, consider the three-point function:

\[ \langle\Omega|\, T\phi(x_1)\phi(x_2)\phi(x_3)\, |\Omega\rangle = Z^{-1} \int \mathcal{D}\phi\; e^{i\int d^4x\, \mathcal{L}[\phi]}\; \phi(x_1)\phi(x_2)\phi(x_3), \tag{9.81} \]

where $\mathcal{L} = \frac12(\partial_\mu\phi)^2 - \frac12 m^2\phi^2$ and $Z$ is a shorthand for $Z[J = 0]$, the
functional integral over the exponential. In classical mechanics, we would derive
the equations of motion by insisting that the action be stationary under an
infinitesimal variation

\[ \phi(x) \to \phi'(x) = \phi(x) + \epsilon(x). \tag{9.82} \]

The appropriate generalization is to consider (9.82) as an infinitesimal change
of variables. A change of variables does not alter the value of the integral. Nor
does a shift of the integration variable alter the measure: $\mathcal{D}\phi' = \mathcal{D}\phi$. Thus we
can write

\[ \int \mathcal{D}\phi\; e^{i\int d^4x\, \mathcal{L}[\phi]}\; \phi(x_1)\phi(x_2)\phi(x_3) = \int \mathcal{D}\phi\; e^{i\int d^4x\, \mathcal{L}[\phi']}\; \phi'(x_1)\phi'(x_2)\phi'(x_3), \]

†There are some subtle exceptions to this rule, which we will treat in Chapter 19.
where $\phi' = \phi + \epsilon$. Expanding this equation to first order in $\epsilon$, we find

\[ \begin{aligned}
0 = \int \mathcal{D}\phi\; e^{i\int d^4x\, \mathcal{L}} \Big\{ & \Big( i\int d^4x\; \epsilon(x) \big[ (-\partial^2 - m^2)\phi(x) \big] \Big)\, \phi(x_1)\phi(x_2)\phi(x_3) \\
 & + \epsilon(x_1)\phi(x_2)\phi(x_3) + \phi(x_1)\epsilon(x_2)\phi(x_3) + \phi(x_1)\phi(x_2)\epsilon(x_3) \Big\}.
\end{aligned} \tag{9.83} \]

The last three terms can be combined with the first by writing, for instance,
$\epsilon(x_1) = \int d^4x\; \epsilon(x)\,\delta(x - x_1)$. Noting that the right-hand side must vanish for
any possible variation $\epsilon(x)$, we then obtain

\[ \begin{aligned}
0 = \int \mathcal{D}\phi\; e^{i\int d^4x\, \mathcal{L}} \Big[ & (\partial^2 + m^2)\phi(x)\, \phi(x_1)\phi(x_2)\phi(x_3) \\
 & + i\delta(x - x_1)\phi(x_2)\phi(x_3) + i\phi(x_1)\delta(x - x_2)\phi(x_3) + i\phi(x_1)\phi(x_2)\delta(x - x_3) \Big].
\end{aligned} \tag{9.84} \]

A similar equation holds for any number of fields $\phi(x_i)$.

To see the implications of (9.84), let us specialize to the case of one field
$\phi(x_1)$ in (9.81). Notice that the derivatives acting on $\phi(x)$ can be pulled
outside the functional integral. Then, dividing (9.84) by $Z$ yields the identity

\[ (\partial^2 + m^2)\, \langle\Omega|\, T\phi(x)\phi(x_1)\, |\Omega\rangle = -i\delta(x - x_1). \tag{9.85} \]

The left-hand side of this relation is the Klein-Gordon operator acting on a
correlation function of $\phi(x)$. The right-hand side is zero unless $x = x_1$; that
is, the correlation function satisfies the Klein-Gordon equation except at the
point where the arguments of the two $\phi$ fields coincide. The modification of
the Klein-Gordon equation at this point is called a contact term. In this simple
case, the modification is hardly unfamiliar to us; Eq. (9.85) merely says that
the Feynman propagator is a Green’s function of the Klein-Gordon operator,
as we originally showed in Section 2.4. We saw there that the delta function
arises when the time derivative in $\partial^2$ acts on the time-ordering symbol. We will
see below that, quite generally in quantum field theory, the classical equations
of motion for fields are satisfied by all quantum correlation functions of those
fields, up to contact terms.

As an example, consider the identity that follows from (9.84) for an
$(n+1)$-point correlation function of scalar fields:

\[ (\partial^2 + m^2)\, \langle\Omega|\, T\phi(x)\phi(x_1)\cdots\phi(x_n)\, |\Omega\rangle = \sum_{i=1}^{n} \langle\Omega|\, T\phi(x_1)\cdots\big( -i\delta(x - x_i) \big)\cdots\phi(x_n)\, |\Omega\rangle. \tag{9.86} \]

This identity says that the Klein-Gordon equation is obeyed by $\phi(x)$ inside
any expectation value, up to contact terms associated with the time ordering.
The result can also be derived from the Hamiltonian formalism using the
methods of Section 2.4, or, using the special properties of free-field theory, by
evaluating both sides of the equation using Wick’s theorem.
As long as the functional measure is invariant under a shift of the
integration variable, we can repeat this argument and obtain the quantum equations
of motion for Green’s functions for any theory of scalar, vector, and spinor
fields. This is the reason why, in Eq. (9.63), we took the shift invariance to be
the fundamental, defining property of the Grassmann integral.

For a general field theory of a field $\varphi(x)$, governed by the Lagrangian
$\mathcal{L}[\varphi]$, the manipulations leading to (9.83) give the identity

\[ \begin{aligned}
0 = \int \mathcal{D}\varphi\; e^{i\int d^4x\, \mathcal{L}} \Big\{ & i\int d^4x\; \epsilon(x)\, \frac{\delta}{\delta\varphi(x)} \Big( \int d^4x'\, \mathcal{L} \Big) \cdot \varphi(x_1)\varphi(x_2) \\
 & + \epsilon(x_1)\varphi(x_2) + \varphi(x_1)\epsilon(x_2) \Big\},
\end{aligned} \tag{9.87} \]

and similar identities for correlation functions of $n$ fields. By the rule for
functional differentiation (9.31), the derivative of the action is

\[ \frac{\delta}{\delta\varphi(x)} \Big( \int d^4x'\, \mathcal{L} \Big) = \frac{\partial\mathcal{L}}{\partial\varphi} - \partial_\mu \left( \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi)} \right); \]

this is the quantity that equals zero by the Euler-Lagrange equation of motion
(2.3) for $\varphi$. Formula (9.87) and its generalizations lead to the set of identities

\[ \left\langle \Big( \frac{\delta}{\delta\varphi(x)} \int d^4x'\, \mathcal{L} \Big)\, \varphi(x_1)\cdots\varphi(x_n) \right\rangle = \sum_{i=1}^{n} \big\langle \varphi(x_1)\cdots\big( i\delta(x - x_i) \big)\cdots\varphi(x_n) \big\rangle. \tag{9.88} \]

In this equation, the angle-brackets denote a time-ordered correlation function
in which derivatives on $\varphi(x)$ are placed outside the time-ordering symbol, as in
Eq. (9.86). Relation (9.88) states that the classical Euler-Lagrange equations
of the field $\varphi$ are obeyed for all Green’s functions of $\varphi$, up to contact terms
arising from the nontrivial commutation relations of field operators. These
quantum equations of motion for Green’s functions, including the proper
contact terms, are called Schwinger-Dyson equations.
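The mechanism behind these identities (the integral of a total derivative vanishes) can be seen in a toy model that is not taken from the text: a "field theory" in zero dimensions with Euclidean weight $e^{-S(\phi)}$ and a hypothetical action $S = \frac12 m^2\phi^2 + \frac{\lambda}{4!}\phi^4$. Integrating $\frac{d}{d\phi}\big[\phi^n e^{-S}\big]$ over the real line gives $\langle S'(\phi)\,\phi^n \rangle = n\,\langle \phi^{n-1} \rangle$, the zero-dimensional counterpart of (9.88), with the right-hand side playing the role of the contact terms. In this Euclidean analog the factor of $i$ is absent. The parameter values and function names below are illustrative.

```python
import numpy as np

m2, lam = 1.0, 0.5                        # illustrative mass^2 and coupling
phi = np.linspace(-10.0, 10.0, 20001)     # integration grid; weight is negligible at the ends

action = 0.5 * m2 * phi**2 + lam * phi**4 / 24.0
d_action = m2 * phi + lam * phi**3 / 6.0  # S'(phi), the "equation of motion"
weight = np.exp(-action)

def expect(f):
    """Expectation value <f> with weight exp(-S); the grid spacing cancels in the ratio."""
    return np.sum(f * weight) / np.sum(weight)

def sd_residual(n):
    """<S'(phi) phi^n> - n <phi^(n-1)>, which should vanish."""
    return expect(d_action * phi**n) - n * expect(phi**(n - 1))

for n in (1, 2, 3, 4):
    print(n, sd_residual(n))
```

Note that the identity holds for the interacting weight, not just the Gaussian one: nothing in the argument required $\lambda = 0$.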



Conservation Laws 

In classical field theory, Noether’s theorem says that, to each symmetry of 
a local Lagrangian, there corresponds a conserved current. In Section 2.2 we 
proved Noether’s theorem by subjecting the Lagrangian to an infinitesimal 
symmetry variation. In the spirit of the above discussion of equations of mo¬ 
tion, we should find the quantum analogue of this theorem by subjecting the 
functional integral to an infinitesimal change of variables along the symmetry 
direction. 

Again, it will be most instructive to begin with an example. Let us
consider the theory of a free, complex-valued scalar field, with the Lagrangian

$$\mathcal{L} = |\partial_\mu\phi|^2 - m^2|\phi|^2. \tag{9.89}$$

This Lagrangian is invariant under the transformation $\phi \to e^{i\alpha}\phi$. The classical
consequences of this invariance were discussed in Section 2.2, below Eq. (2.14).
To find the quantum formulae, consider the infinitesimal change of variables

$$\phi(x) \to \phi'(x) = \phi(x) + i\alpha(x)\phi(x). \tag{9.90}$$


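Note that we have made the infinitesimal angle of rotation a function of $x$;
the reason for this will be clear in a moment.

The measure of functional integration is invariant under the transformation
(9.90), since this is a unitary transformation of the variables $\phi(x)$. Thus,
for the case of two fields,

$$\int \mathcal{D}\phi\, e^{i\int d^4x\,\mathcal{L}[\phi]}\,\phi(x_1)\phi^*(x_2) = \int \mathcal{D}\phi\, e^{i\int d^4x\,\mathcal{L}[\phi']}\,\phi'(x_1)\phi'^*(x_2)\bigg|_{\phi' = (1+i\alpha)\phi}.$$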


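Expanding this equation to first order in $\alpha$, we find

$$0 = \int \mathcal{D}\phi\, e^{i\int d^4x\,\mathcal{L}} \left\{ i\int d^4x\, \Big[ (\partial_\mu\alpha)\cdot i\big(\phi\,\partial^\mu\phi^* - \phi^*\partial^\mu\phi\big) \Big]\, \phi(x_1)\phi^*(x_2) + \big[i\alpha(x_1)\phi(x_1)\big]\phi^*(x_2) + \phi(x_1)\big[-i\alpha(x_2)\phi^*(x_2)\big] \right\}.$$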

Notice that the variation of the Lagrangian contains only terms proportional
to $\partial_\mu\alpha$, since the substitution (9.90) with a constant $\alpha$ leaves the Lagrangian
invariant. To put this relation into a familiar form, integrate the term involving
$\partial_\mu\alpha$ by parts. Then taking the coefficient of $\alpha(x)$ and dividing by $Z$ gives

$$\big\langle \partial_\mu j^\mu(x)\,\phi(x_1)\phi^*(x_2) \big\rangle = (-i)\Big\langle \big(i\phi(x_1)\delta(x - x_1)\big)\phi^*(x_2) + \phi(x_1)\big(-i\phi^*(x_2)\delta(x - x_2)\big) \Big\rangle, \tag{9.91}$$

where

$$j^\mu = i\big(\phi\,\partial^\mu\phi^* - \phi^*\partial^\mu\phi\big) \tag{9.92}$$


is the Noether current identified in Eq. (2.16). As in Eq. (9.88), the correlation
function denotes a time-ordered product with the derivative on $j^\mu(x)$ placed
outside the time-ordering symbol. Relation (9.91) is the classical conservation 
law plus contact terms, that is, the Schwinger-Dyson equation associated with 
current conservation. 
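
To make the sign bookkeeping explicit, here is a sketch of the intermediate step between the first-order identity and (9.91). It assumes that $\alpha(x)$ falls off fast enough at infinity for surface terms to be dropped. With $j^\mu = i(\phi\,\partial^\mu\phi^* - \phi^*\partial^\mu\phi)$, the derivative term becomes

$$i\int d^4x\,(\partial_\mu\alpha)\,j^\mu = -i\int d^4x\,\alpha(x)\,\partial_\mu j^\mu(x),$$

and each local term can be written with a delta function, $\alpha(x_i) = \int d^4x\,\alpha(x)\,\delta(x - x_i)$. Setting the coefficient of $\alpha(x)$ to zero and dividing by $Z$ then gives

$$-i\,\big\langle \partial_\mu j^\mu(x)\,\phi(x_1)\phi^*(x_2) \big\rangle + \big\langle \big(i\phi(x_1)\delta(x - x_1)\big)\phi^*(x_2) \big\rangle + \big\langle \phi(x_1)\big(-i\phi^*(x_2)\delta(x - x_2)\big) \big\rangle = 0.$$

Solving for the first term reproduces (9.91), including the overall factor $(-i)$ on the right-hand side.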

It is not much more difficult to discuss current conservation in more general
situations. Consider a local field theory of a set of fields $\varphi_a(x)$, governed
by a Lagrangian $\mathcal{L}[\varphi]$. An infinitesimal symmetry transformation on the fields
$\varphi_a$ will be of the general form

$$\varphi_a(x) \to \varphi_a(x) + \epsilon\,\Delta\varphi_a(x). \tag{9.93}$$


We assume that the action is invariant under this transformation. Then, as in
Eq. (2.10), if the parameter $\epsilon$ is taken to be a constant, the Lagrangian must
be invariant up to a total divergence:

$$\mathcal{L}[\varphi] \to \mathcal{L}[\varphi] + \epsilon\,\partial_\mu\mathcal{J}^\mu. \tag{9.94}$$
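
If the symmetry parameter $\epsilon$ depends on $x$, as in the analysis of the previous
paragraph, the variation of the Lagrangian will be slightly more complicated:

$$\mathcal{L}[\varphi] \to \mathcal{L}[\varphi] + (\partial_\mu\epsilon)\,\Delta\varphi_a\,\frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_a)} + \epsilon\,\partial_\mu\mathcal{J}^\mu.$$

Summation over the index $a$ is understood. Then

$$\frac{\delta}{\delta\epsilon(x)} \int d^4x\,\mathcal{L}[\varphi + \epsilon\Delta\varphi] = -\partial_\mu j^\mu(x), \tag{9.95}$$

where $j^\mu$ is the Noether current of Eq. (2.12),

$$j^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_\mu\varphi_a)}\,\Delta\varphi_a - \mathcal{J}^\mu. \tag{9.96}$$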

Using result (9.95) and carrying through the steps leading up to (9.91), we 
find the Schwinger-Dyson equation: 

$$\big\langle \partial_\mu j^\mu(x)\,\varphi_a(x_1)\varphi_b(x_2) \big\rangle = (-i)\Big\langle \big(\Delta\varphi_a(x_1)\delta(x - x_1)\big)\varphi_b(x_2) + \varphi_a(x_1)\big(\Delta\varphi_b(x_2)\delta(x - x_2)\big) \Big\rangle. \tag{9.97}$$

A similar equation can be found for the correlator of $\partial_\mu j^\mu$ with $n$ fields $\varphi(x)$.
These give the full set of Schwinger-Dyson equations associated with the
classical Noether theorem.

As an example of the use of this variational procedure to obtain the 
Noether current, consider the symmetry of the Lagrangian with respect to 
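spacetime translations. Under the transformation

$$\varphi_a \to \varphi_a + a^\mu(x)\,\partial_\mu\varphi_a \tag{9.98}$$

the Lagrangian transforms as

$$\mathcal{L} \to \mathcal{L} + \partial_\nu a^\mu\,\partial_\mu\varphi_a\,\frac{\partial\mathcal{L}}{\partial(\partial_\nu\varphi_a)} + a^\mu\partial_\mu\mathcal{L}.$$

The variation of $\int d^4x\,\mathcal{L}$ with respect to $a^\mu$ then gives rise to the conservation
equation for the energy-momentum tensor, $\partial_\nu T^{\mu\nu} = 0$, with

$$T^{\mu\nu} = \frac{\partial\mathcal{L}}{\partial(\partial_\nu\varphi_a)}\,\partial^\mu\varphi_a - g^{\mu\nu}\mathcal{L}, \tag{9.99}$$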

in agreement with Eq. (2.17). 

The trick we have used in this section, that of considering a symmetry 
transformation whose parameter is a function of spacetime, is reminiscent 
of a technical feature of our earlier discussion introducing the Lagrangian 
of QED. In Eq. (4.6), we noted that the minimal coupling prescription for 
coupling the photon to charged fields produces a Lagrangian invariant not 
only under the global symmetry transformation with $\epsilon$ constant, but also
under a transformation in which the symmetry parameter depends on x. In 
Chapter 15, we will draw these two ideas together in a general discussion of 
field theories with local symmetries. 

The Ward-Takahashi Identity

As a final application of the methods of this section, let us derive the 
Schwinger-Dyson equations associated with the global symmetry of QED. 
Consider making, in the QED functional integral, the change of variables 


$$\psi(x) \to \big(1 + ie\alpha(x)\big)\psi(x), \tag{9.100}$$


without the corresponding term in the transformation law for $A_\mu$ (which
would make the Lagrangian invariant under the transformation). The QED
Lagrangian (4.3) then transforms according to

$$\mathcal{L} \to \mathcal{L} - e\,\partial_\mu\alpha\,\bar\psi\gamma^\mu\psi. \tag{9.101}$$


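The transformation (9.100) thus leads to the following identity for the
functional integral over two fermion fields:

$$0 = \int \mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}A\; e^{i\int d^4x\,\mathcal{L}} \left\{ -i\int d^4x\,\partial_\mu\alpha(x)\Big[ j^\mu(x)\,\psi(x_1)\bar\psi(x_2) \Big] + \big(ie\alpha(x_1)\psi(x_1)\big)\bar\psi(x_2) + \psi(x_1)\big(-ie\alpha(x_2)\bar\psi(x_2)\big) \right\}, \tag{9.102}$$

with $j^\mu = e\bar\psi\gamma^\mu\psi$. As in our other examples, an analogous equation holds for
any number of fermion fields.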

To understand the implications of this set of equations, consider first the 
specific case (9.102). Dividing this relation by Z, we find 


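$$i\partial_\mu \langle 0|\,T j^\mu(x)\,\psi(x_1)\bar\psi(x_2)\,|0\rangle = -ie\,\delta(x - x_1)\,\langle 0|\,T\psi(x_1)\bar\psi(x_2)\,|0\rangle + ie\,\delta(x - x_2)\,\langle 0|\,T\psi(x_1)\bar\psi(x_2)\,|0\rangle. \tag{9.103}$$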


To put this equation into a more familiar form, compute its Fourier transform
by integrating:

$$\int d^4x\, e^{-ik\cdot x} \int d^4x_1\, e^{+iq\cdot x_1} \int d^4x_2\, e^{-ip\cdot x_2} \cdots. \tag{9.104}$$


Then the amplitudes in (9.103) are converted to the amplitudes $\mathcal{M}^\mu(k;p;q)$
and $\mathcal{M}_0(p;q)$ defined below (7.67) in our discussion of the Ward-Takahashi
identity. Indeed, (9.103) falls directly into the form

$$-ik_\mu\,\mathcal{M}^\mu(k;p;q) = -ie\,\mathcal{M}_0(p;q-k) + ie\,\mathcal{M}_0(p+k;q). \tag{9.105}$$


This is exactly the Ward-Takahashi identity for two external fermions, which 
we derived diagrammatically in Section 7.4. It is not difficult to check that 
the more general relations involving n fermion fields lead to the general Ward- 
Takahashi identity presented in (7.68). Because of this relation, the formula 
(9.97) associated with the arbitrary symmetry (9.93) is usually also referred 
to as a Ward-Takahashi identity, the one associated with the symmetry and 
its Noether current. 

We have now arrived at a more general understanding of the terms on the
right-hand side of the Ward-Takahashi identity. These are the contact terms 
that we now expect to find when we convert classical equations of motion 
to Schwinger-Dyson equations for quantum Green’s functions. The functional 
integral formalism allows a simple and elegant derivation of these quantum- 
mechanical terms. 

Problems 

9.1 Scalar QED. This problem concerns the theory of a complex scalar field $\phi$
interacting with the electromagnetic field $A^\mu$. The Lagrangian is

$$\mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}^2 + (D_\mu\phi)^*(D^\mu\phi) - m^2\phi^*\phi,$$

where $D_\mu = \partial_\mu + ieA_\mu$ is the usual gauge-covariant derivative.

(a) Use the functional method of Section 9.2 to show that the propagator of the
complex scalar field is the same as that of a real field:

$$\frac{i}{p^2 - m^2 + i\epsilon}.$$

Also derive the Feynman rules for the interactions between photons and scalar
particles; you should find

$$\text{(scalar-scalar-photon vertex)} = -ie(p + p')^\mu; \qquad \text{(scalar-scalar-photon-photon vertex)} = 2ie^2 g^{\mu\nu}.$$


(b) Compute, to lowest order, the differential cross section for $e^+e^- \to \phi\phi^*$. Ignore
the electron mass (but not the scalar particle's mass), and average over the
electron and positron polarizations. Find the asymptotic angular dependence
and total cross section. Compare your results to the corresponding formulae for
$e^+e^- \to \mu^+\mu^-$.

(c) Compute the contribution of the charged scalar to the photon vacuum
polarization, using dimensional regularization. Note that there are two diagrams. To
put the answer into the expected form,

$$\Pi^{\mu\nu}(q^2) = (g^{\mu\nu}q^2 - q^\mu q^\nu)\,\Pi(q^2),$$

it is useful to add the two diagrams at the beginning, putting both terms over
a common denominator before introducing a Feynman parameter. Show that,
for $-q^2 \gg m^2$, the charged boson contribution to $\Pi(q^2)$ is exactly 1/4 that of a
virtual electron-positron pair.

9.2 Quantum statistical mechanics. 

(a) Evaluate the quantum statistical partition function 

$$Z = \mathrm{tr}\left[e^{-\beta H}\right]$$

(where $\beta = 1/kT$) using the strategy of Section 9.1 for evaluating the matrix
elements of $e^{-iHt}$ in terms of functional integrals. Show that one again finds a
functional integral, over functions defined on a domain that is of length $\beta$ and
periodically connected in the time direction. Note that the Euclidean form of
the Lagrangian appears in the weight.

(b) Evaluate this integral for a simple harmonic oscillator,

$$L_E = \tfrac{1}{2}\dot{x}^2 + \tfrac{1}{2}\omega^2 x^2,$$

by introducing a Fourier decomposition of $x(t)$:

$$x(t) = \sum_n x_n \cdot \frac{1}{\sqrt{\beta}}\, e^{2\pi i n t/\beta}.$$

The dependence of the result on $\beta$ is a bit subtle to obtain explicitly, since the
measure for the integral over $x(t)$ depends on $\beta$ in any discretization. However,
the dependence on $\omega$ should be unambiguous. Show that, up to a (possibly
divergent and $\beta$-dependent) constant, the integral reproduces exactly the familiar
expression for the quantum partition function of an oscillator. [You may find the
identity

$$\sinh z = z \cdot \prod_{n=1}^{\infty}\left(1 + \frac{z^2}{(n\pi)^2}\right)$$

useful.]

(c) Generalize this construction to field theory. Show that the quantum statistical
partition function for a free scalar field can be written in terms of a functional
integral. The value of this integral is given formally by

$$\left[\det(-\partial^2 + m^2)\right]^{-1/2},$$

where the operator acts on functions on Euclidean space that are periodic in the
time direction with periodicity $\beta$. As before, the $\beta$ dependence of this expression
is difficult to compute directly. However, the dependence on $m^2$ is unambiguous.
(More generally, one can usually evaluate the variation of a functional
determinant with respect to any explicit parameter in the Lagrangian.) Show that the
determinant indeed reproduces the partition function for relativistic scalar
particles.

(d) Now let $\bar\psi(t)$, $\psi(t)$ be two Grassmann-valued coordinates, and define a fermionic
oscillator by writing the Lagrangian

$$L_E = \bar\psi\dot\psi + \omega\bar\psi\psi.$$

This Lagrangian corresponds to the Hamiltonian

$$H = \omega\bar\psi\psi, \quad \text{with } \{\bar\psi, \psi\} = 1;$$

that is, to a simple two-level system. Evaluate the functional integral, assuming
that the fermions obey antiperiodic boundary conditions: $\psi(t + \beta) = -\psi(t)$.
(Why is this reasonable?) Show that the result reproduces the partition function
of a quantum-mechanical two-level system, that is, of a quantum state with Fermi
statistics.

(e) Define the partition function for the photon field as the gauge-invariant
functional integral

$$Z = \int \mathcal{D}A\, \exp\left(-\int d^4x_E\, \Big[\tfrac{1}{4}(F_{\mu\nu})^2\Big]\right)$$

over vector fields that are periodic in the time direction with period $\beta$.
Apply the gauge-fixing procedure discussed in Section 9.4 (working, for example,
in Feynman gauge). Evaluate the functional determinants using the result of
part (c) and show that the functional integral does give the correct quantum
statistical result (including the correct counting of polarization states).

While computing radiative corrections in Chapters 6 and 7, we encountered
three QED diagrams with ultraviolet divergences: 


In each case we saw that the divergence could be regulated and canceled, 
yielding finite expressions for measurable quantities. In Chapter 8, we pointed 
out that such ultraviolet divergences occur commonly and, in fact, naturally 
in quantum field theory calculations. We sketched a physical interpretation of 
these divergences, with implications both in quantum field theory and in the 
statistical theory of phase transitions. In the next few chapters, we will convert 
this sketchy picture into a quantitative theory that allows precise calculations. 

In this chapter, we begin this study by developing a classification of the 
ultraviolet divergences that can appear in a quantum field theory. Rather than 
stumbling across these divergences one by one and repairing them case by case, 
we now set out to determine once and for all which diagrams are divergent, 
and in which theories these divergences can be eliminated systematically. As 
examples we will consider both QED and scalar field theories. 

10.1 Counting of Ultraviolet Divergences 

In this section we will use elementary arguments to determine, tentatively, 
when a Feynman diagram contains an ultraviolet divergence. We begin by 
analyzing quantum electrodynamics. 

First we introduce the following notation, to characterize a typical diagram
in QED:

$N_e$ = number of external electron lines;

$N_\gamma$ = number of external photon lines;

$P_e$ = number of electron propagators;

$P_\gamma$ = number of photon propagators;

$V$ = number of vertices;

$L$ = number of loops.

(This analysis applies to correlation functions as well as scattering amplitudes.
In the former case, propagators that are connected to external points should 
be counted as external lines, not as propagators.) 

The expression corresponding to a typical diagram looks like this: 

$$\sim \int \frac{d^4k_1\, d^4k_2 \cdots d^4k_L}{(\not{k}_i - m)\cdots(k_j^2)\cdots(k_n^2)}.$$

For each loop there is a potentially divergent 4-momentum integral, but each
propagator aids the convergence of this integral by putting one or two powers
of momentum into the denominator. Very roughly speaking, the diagram
diverges unless there are more powers of momentum in the denominator than
in the numerator. Let us therefore define the superficial degree of divergence,
$D$, as the difference:

$$\begin{aligned} D &= (\text{power of } k \text{ in numerator}) - (\text{power of } k \text{ in denominator}) \\ &= 4L - P_e - 2P_\gamma. \end{aligned} \tag{10.1}$$


Naively, we expect a diagram to have a divergence proportional to $\Lambda^D$, where
$\Lambda$ is a momentum cutoff, when $D > 0$. We expect a divergence of the form
$\log\Lambda$ when $D = 0$, and no divergence when $D < 0$.

This naive expectation is often wrong, for one of three reasons (see
Fig. 10.1). When a diagram contains a divergent subdiagram, its actual
divergence may be worse than that indicated by $D$. When symmetries (such
as the Ward identity) cause certain terms to cancel, the divergence of a
diagram may be reduced or even eliminated. Finally, a trivial diagram with no
propagators and no loops has $D = 0$ but no divergence.

Despite all of these complications, $D$ is still a useful quantity. To see why,
let us rewrite it in terms of the number of external lines ($N_e$, $N_\gamma$) and vertices
($V$). Note that the number of loop integrations in a diagram is

$$L = P_e + P_\gamma - V + 1, \tag{10.2}$$

since in our original Feynman rules each propagator has a momentum integral,
each vertex has a delta function, and one delta function merely enforces overall
momentum conservation. Furthermore, the number of vertices is

$$V = 2P_\gamma + N_\gamma = \tfrac{1}{2}(2P_e + N_e), \tag{10.3}$$

since each vertex involves exactly one photon line and two electron lines. (The
propagators count twice since they have two ends on vertices.) Putting these
relations together, we find that $D$ can be expressed as

$$\begin{aligned} D &= 4(P_e + P_\gamma - V + 1) - P_e - 2P_\gamma \\ &= 4 - N_\gamma - \tfrac{3}{2}N_e, \end{aligned} \tag{10.4}$$

independent of the number of vertices. The superficial degree of divergence of
a QED diagram depends only on the number of external legs of each type.

Figure 10.1. Some simple QED diagrams that illustrate the superficial degree
of divergence. The first diagram is finite, even though $D = 0$. The third
diagram has $D = 2$ but only a logarithmic divergence, due to the Ward identity
(see Section 7.5). The fourth diagram diverges, even though $D < 0$, since
it contains a divergent subdiagram. Only in the second and fifth diagrams
does the superficial degree of divergence coincide with the actual degree of
divergence.

According to result (10.4), only diagrams with a small number of external
legs have $D \geq 0$; those seven types of diagrams are shown in Fig. 10.2. Since
external legs do not enter the potentially divergent integral, we can restrict our 
attention to amputated diagrams. We can also restrict our attention to one- 
particle-irreducible diagrams, since reducible diagrams are simple products 
of the integrals corresponding to their irreducible parts. Thus the task of 
enumerating all of the divergent QED diagrams reduces to that of analyzing 
the seven types of amputated, one-particle-irreducible amplitudes shown in 
Fig. 10.2. Other diagrams may diverge, but only when they contain one of 
these seven as a subdiagram. Let us therefore consider each of these seven 
amplitudes in turn. 
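
The counting above is easy to check mechanically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the only inputs are the relations D = 4L - P_e - 2P_gamma and D = 4 - N_gamma - (3/2)N_e quoted in this section. It confirms that exactly seven combinations of external legs have D >= 0.

```python
from fractions import Fraction


def degree_from_counts(loops: int, p_e: int, p_gamma: int) -> int:
    """Eq. (10.1): D = 4L - P_e - 2 P_gamma for a QED diagram in d = 4."""
    return 4 * loops - p_e - 2 * p_gamma


def degree_from_legs(n_gamma: int, n_e: int) -> Fraction:
    """Eq. (10.4): D = 4 - N_gamma - (3/2) N_e."""
    return 4 - n_gamma - Fraction(3, 2) * n_e


def superficially_divergent(max_legs: int = 8):
    """List (N_gamma, N_e, D) for all leg counts up to max_legs with D >= 0."""
    found = []
    for n_e in range(0, max_legs + 1, 2):  # electron lines end in pairs
        for n_gamma in range(0, max_legs + 1):
            d = degree_from_legs(n_gamma, n_e)
            if d >= 0:
                found.append((n_gamma, n_e, d))
    return found


if __name__ == "__main__":
    # One-loop electron self-energy: L = 1, P_e = 1, P_gamma = 1, N_e = 2.
    print(degree_from_counts(1, 1, 1), degree_from_legs(0, 2))
    for n_gamma, n_e, d in superficially_divergent():
        print(f"N_gamma={n_gamma}  N_e={n_e}  D={d}")
```

For the one-loop electron self-energy both functions give the same value of D, as they must, since (10.4) is just (10.1) rewritten with (10.2) and (10.3).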

The zero-point function, Fig. 10.2a, is very badly divergent. But this
object merely causes an unobservable shift of the vacuum energy; it never
contributes to $S$-matrix elements.

Figure 10.2. The seven QED amplitudes whose superficial degree of divergence
($D$) is $\geq 0$. (Each circle represents the sum of all possible QED
diagrams.) As explained in the text, amplitude (a) is irrelevant to scattering
processes, while amplitudes (b) and (d) vanish because of symmetries.
Amplitude (e) is nonzero, but its divergent parts cancel due to the Ward identity.
The remaining amplitudes (c, f, and g) are all logarithmically divergent, even
though $D > 0$ for (c) and (f).

To analyze the photon one-point function (Fig. 10.2b), note that the external
photon must be attached to a QED vertex. Neglecting the external
photon propagator, this amplitude is therefore

$$\text{(photon one-point amplitude)} = -ie\int d^4x\, e^{-iq\cdot x}\,\langle\Omega|\,T j_\mu(x)\,|\Omega\rangle, \tag{10.5}$$

where $j^\mu = \bar\psi\gamma^\mu\psi$ is the electromagnetic current operator. But the vacuum
expectation value of $j^\mu$ must vanish by Lorentz invariance, since otherwise it
would be a preferred 4-vector.

The photon one-point function also vanishes for a second reason:
charge-conjugation invariance. Recall that $C$ is a symmetry of QED, so $C|\Omega\rangle = |\Omega\rangle$.
But $j^\mu(x)$ changes sign under charge conjugation, $C j^\mu(x) C^\dagger = -j^\mu(x)$, so its
vacuum expectation value must vanish:

$$\langle\Omega|\,T j^\mu(x)\,|\Omega\rangle = \langle\Omega|\,C^\dagger C\, j^\mu(x)\, C^\dagger C\,|\Omega\rangle = -\langle\Omega|\,T j^\mu(x)\,|\Omega\rangle = 0.$$

The same argument applies to any vacuum expectation value of an odd number
of electromagnetic currents. In particular, the photon three-point function,
Fig. 10.2d, vanishes. (This result is known as Furry's theorem.) It is not hard
to check explicitly that the photon one- and three-point functions vanish in
the leading order of perturbation theory (see Problem 10.1).

The remaining amplitudes in Fig. 10.2 are all nonzero, so we must analyze 
their structures in more detail. Consider, for example, the electron self-energy 
(Fig. 10.2f). This amplitude is a function of the electron momentum $p$, so let
us expand it in a Taylor series about $p = 0$:

$$\text{(electron self-energy amplitude)} = A_0 + A_1\not{p} + A_2 p^2 + \cdots,$$

where each coefficient is independent of $p$:

$$A_n = \frac{1}{n!}\,\frac{d^n}{d\not{p}^{\,n}}\Big(\text{electron self-energy amplitude}\Big)\bigg|_{\not{p}=0}.$$

(These coefficients are infrared divergent; to compute them explicitly we would
need an infrared regulator, as in Chapter 6.) The diagrams contributing to the
electron self-energy depend on $p$ through the denominators of propagators.
To compute the coefficients $A_n$, we differentiate these propagators, giving
expressions like

$$\frac{d}{d\not{p}}\left(\frac{1}{\not{k} + \not{p} - m}\right) = -\frac{1}{(\not{k} + \not{p} - m)^2}.$$

That is, each derivative with respect to the external momentum $p$ lowers the
superficial degree of divergence by 1. Since the constant term $A_0$ has
(superficially) a linear divergence, $A_1$ can have only a logarithmic divergence;
all the remaining $A_n$ are finite. (This argument breaks down when the
divergence is in a subdiagram, since then not all propagators involve the large
momentum $k$. We will face this problem in Section 10.4.)

The electron self-energy amplitude has one additional subtlety. If the constant
term $A_0$ were proportional to $\Lambda$ (the ultraviolet cutoff), the electron mass
shift would, according to the analysis in Section 7.1, also have a term
proportional to $\Lambda$. But the electron mass shift must actually be proportional to $m$,
since chiral symmetry would forbid a mass shift if $m$ were zero. At worst, the
constant term can be proportional to $m\log\Lambda$. We therefore expect the entire
self-energy amplitude to have the form

$$\text{(electron self-energy amplitude)} = a_0\, m\log\Lambda + a_1\not{p}\,\log\Lambda + (\text{finite terms}), \tag{10.6}$$

exactly what we found for the term of order $\alpha$ in Eq. (7.19).

Let us analyze the exact electron-photon vertex, Fig. 10.2g, in the same
way. (Again we implicitly assume that infrared divergences have been
regulated.) Expanding in powers of the three external momenta, we immediately
see that only the constant term is divergent, since differentiating with respect
to any external momentum would lower the degree of divergence to $-1$. This
amplitude therefore contains only one divergent constant:

$$\text{(electron-photon vertex amplitude)} \propto -ie\gamma^\mu\log\Lambda + \text{finite terms}. \tag{10.7}$$

As discussed in Section 7.5, the photon self-energy (Fig. 10.2c) is
constrained by the Ward identity to have the form

$$\text{(photon self-energy amplitude)} = (g^{\mu\nu}q^2 - q^\mu q^\nu)\,\Pi(q^2). \tag{10.8}$$

Viewing this expression as a Taylor series in $q$, we see that the constant and
linear terms both vanish, lowering the superficial degree of divergence from
2 to 0. The only divergence, therefore, is in the constant term of $\Pi(q^2)$, and
this divergence is only logarithmic. This result is exactly what we found for
the lowest-order contribution to $\Pi(q^2)$ in Eq. (7.90).

Finally, consider the photon-photon scattering amplitude, Fig. 10.2e. The
Ward identity requires that if we replace any external photon by its
momentum vector, the amplitude vanishes:

$$k^\mu \cdot \big(\text{four-photon amplitude with external photon index } \mu\big) = 0. \tag{10.9}$$

By exhaustion one can show that this condition is satisfied only if the
amplitude is proportional to $(g^{\mu\nu}k^\sigma - g^{\mu\sigma}k^\nu)$, with a similar factor for each of the
other three legs. Each of these factors involves one power of momentum, so
all terms with fewer than four powers of momentum in the Taylor series of this
amplitude must vanish. The first nonvanishing term has $D = 0 - 4 = -4$, and
therefore this amplitude is finite.

In summary, we have found that there are only three "primitively"
divergent amplitudes in QED: the three that we already found in Chapters 6
and 7. (Other amplitudes may also be divergent, but only because of
diagrams that contain these primitive amplitudes as components.) Furthermore,
the dependence of these divergent amplitudes on external momenta is
extremely simple. If we expand each amplitude as a power series in its external
momenta, there are altogether only four divergent coefficients in the
expansions. In other words, QED contains only four divergent numbers. In the next
section we will see how these numbers can be absorbed into unobservable
Lagrangian parameters, so that observable scattering amplitudes are always
finite.

For the remainder of this section, let us try to understand the superficial 
degree of divergence from a more general viewpoint. The theory of QED in 
four spacetime dimensions is rather special, so let us first generalize to QED
in $d$ dimensions. In this case, $D$ is given by

$$D = dL - P_e - 2P_\gamma, \tag{10.10}$$

since each loop contributes a $d$-dimensional momentum integral. Relations
(10.2) and (10.3) still hold, so we can again rewrite $D$ in terms of $V$, $N_e$,
and $N_\gamma$. This time the result is

$$D = d + \left(\frac{d-4}{2}\right)V - \left(\frac{d-2}{2}\right)N_\gamma - \left(\frac{d-1}{2}\right)N_e. \tag{10.11}$$

The cancellation of V in this expression is special to the case d = 4. For d < 4, 
diagrams with more vertices have a lower degree of divergence, so the total 
number of divergent diagrams is finite. For d > 4, diagrams with more vertices 
have a higher degree of divergence, so every amplitude becomes superficially 
divergent at a sufficiently high order in perturbation theory. 
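
As an illustration of this dependence on the number of vertices, here is a small sketch (the function name is illustrative, not from the text). It evaluates the result of combining (10.10) with (10.2) and (10.3), namely D = d + ((d-4)/2)V - ((d-2)/2)N_gamma - ((d-1)/2)N_e, for the electron self-energy at increasing order in d = 3, 4, and 5.

```python
from fractions import Fraction


def qed_degree(d: int, vertices: int, n_gamma: int, n_e: int) -> Fraction:
    """Superficial degree of divergence of a QED diagram in d dimensions.

    D = d + ((d-4)/2) V - ((d-2)/2) N_gamma - ((d-1)/2) N_e
    """
    return (d
            + Fraction(d - 4, 2) * vertices
            - Fraction(d - 2, 2) * n_gamma
            - Fraction(d - 1, 2) * n_e)


if __name__ == "__main__":
    # Electron self-energy (N_gamma = 0, N_e = 2) with V = 2, 4, 6 vertices.
    for d in (3, 4, 5):
        row = [str(qed_degree(d, v, 0, 2)) for v in (2, 4, 6)]
        print(f"d={d}: D at V=2,4,6 ->", ", ".join(row))
```

In this sketch D falls with V for d = 3, stays fixed for d = 4, and grows with V for d = 5, which is the pattern described in the paragraph above.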

These three possible types of ultraviolet behavior will also occur in other 
quantum field theories. We will refer to them as follows: 

Super-Renormalizable theory: Only a finite number of Feynman diagrams
superficially diverge.

Renormalizable theory: Only a finite number of amplitudes superficially
diverge; however, divergences occur at all orders in perturbation theory.

Non-Renormalizable theory: All amplitudes are divergent at a sufficiently
high order in perturbation theory.

Using this nomenclature, we would say that QED is renormalizable in four 
dimensions, super-renormalizable in less than four dimensions, and non- 
renormalizable in more than four dimensions. 

These superficial criteria give a correct picture of the true divergence
structure of the theory for most cases that have been studied in detail.
Examples are known in which the true behavior is better than this picture suggests,
when powerful symmetries set to zero some or all of the superficially divergent
amplitudes.* On the other hand, as we will explain in Section 10.4, it is always
true that the divergences of superficially renormalizable theories can be
absorbed into a finite number of Lagrangian parameters. For theories containing
fields of spin 1 and higher, loop diagrams can produce additional problems,
including violation of unitarity; we will discuss this difficulty in Chapter 16.

As another example of the counting of ultraviolet divergences, consider a
pure scalar field theory, in $d$ dimensions, with a $\phi^n$ interaction term:

$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m^2\phi^2 - \frac{\lambda}{n!}\phi^n. \tag{10.12}$$

Let $N$ be the number of external lines in a diagram, $P$ the number of
propagators, and $V$ the number of vertices. The number of loops in a diagram is
$L = P - V + 1$. There are $n$ lines meeting at each vertex, so $nV = N + 2P$.

*Some exotic four-dimensional field theories are actually free of divergences; see, 
for example, the article by P. West in Shelter Island II, R. Jackiw, N. N. Khuri, S. 
Weinberg, and E. Witten, eds. (MIT Press, Cambridge, 1985). 

Combining these relations, we find that the superficial degree of divergence of
a diagram is

$$\begin{aligned} D &= dL - 2P \\ &= d + \left[n\left(\frac{d-2}{2}\right) - d\right]V - \left(\frac{d-2}{2}\right)N. \end{aligned} \tag{10.13}$$

In four dimensions a $\phi^4$ coupling is renormalizable, while higher powers of
$\phi$ are non-renormalizable. In three dimensions a $\phi^6$ coupling becomes
renormalizable, while $\phi^4$ is super-renormalizable. In two spacetime dimensions any
coupling of the form $\phi^n$ is super-renormalizable.
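
The algebra leading to the closed form can be cross-checked numerically. The sketch below (function names are illustrative) computes D once directly from D = dL - 2P with L = P - V + 1 and nV = N + 2P, and once from the closed form D = d + [n(d-2)/2 - d]V - [(d-2)/2]N, and compares the two.

```python
from fractions import Fraction


def degree_direct(d: int, n: int, vertices: int, legs: int) -> Fraction:
    """D = d L - 2 P, with P = (n V - N)/2 and L = P - V + 1."""
    propagators = Fraction(n * vertices - legs, 2)
    loops = propagators - vertices + 1
    return d * loops - 2 * propagators


def degree_closed_form(d: int, n: int, vertices: int, legs: int) -> Fraction:
    """Closed form: D = d + [n (d-2)/2 - d] V - [(d-2)/2] N."""
    half = Fraction(d - 2, 2)
    return d + (n * half - d) * vertices - half * legs


if __name__ == "__main__":
    for d, n in ((4, 4), (4, 6), (3, 4), (3, 6), (2, 8)):
        for vertices in (1, 2, 3):
            for legs in (2, 4):
                a = degree_direct(d, n, vertices, legs)
                b = degree_closed_form(d, n, vertices, legs)
                print(f"d={d} n={n} V={vertices} N={legs}: D={a}, match={a == b}")
```

For the phi^4 theory in d = 4 the V dependence drops out and the closed form reduces to D = 4 - N.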

Expression (10.13) can also be derived in a somewhat different way, from 
dimensional analysis. In any quantum field theory, the action $S = \int d^dx\,\mathcal{L}$
must be dimensionless, since we work in units where $\hbar = 1$. In this system of
units, the integral $d^dx$ has units $(\text{mass})^{-d}$, and so the Lagrangian has units
$(\text{mass})^{d}$. Since all units can be expressed as powers of mass, it is unambiguous
to say simply that the Lagrangian has "dimension $d$". Using this result, we
can infer from the explicit form of (10.12) the dimensions of the field $\phi$ and the
coupling constant $\lambda$. From the kinetic term in $\mathcal{L}$ we see that $\phi$ has dimension
$(d-2)/2$. Note that the parameter $m$ consistently has dimensions of mass.
From the interaction term and the dimension of $\phi$, we infer that $\lambda$ has
dimension $d - n(d-2)/2$.

Now consider an arbitrary diagram with $N$ external lines. One way that
such a diagram could arise is from an interaction term $\eta\phi^N$ in the Lagrangian.
The dimension of $\eta$ would then be $d - N(d-2)/2$, and therefore we
conclude that any (amputated) diagram with $N$ external lines has dimension
$d - N(d-2)/2$. In our theory with only the $\lambda\phi^n$ vertex, if the diagram has
$V$ vertices, its divergent part is proportional to $\lambda^V\Lambda^D$, where $\Lambda$ is a
high-momentum cutoff and $D$ is the superficial degree of divergence. (This is the
"generic" case; all the exceptions noted above also apply here.) Applying
dimensional analysis, we find

$$d - N\left(\frac{d-2}{2}\right) = V\left[d - n\left(\frac{d-2}{2}\right)\right] + D,$$

in agreement with (10.13).

Note that the quantity that multiplies $V$ in this expression is just the
dimension of the coupling constant $\lambda$. This analysis can be carried out for
QED and other field theories, with the same result. Thus we can characterize
the three degrees of renormalizability in a second way:

Super-Renormalizable: Coupling constant has positive mass dimension.

Renormalizable: Coupling constant is dimensionless.

Non-Renormalizable: Coupling constant has negative mass dimension.

This is exactly the conclusion that we stated without proof in Section 4.1. In
QED, the coupling constant $e$ is dimensionless; thus QED is (at least
superficially) renormalizable.
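
The coupling-dimension criterion is simple enough to state as a few lines of code. This is a sketch for the scalar theory with a phi^n interaction only; the function names are illustrative, and the single input is the mass dimension d - n(d-2)/2 of the coupling derived above.

```python
from fractions import Fraction


def coupling_dimension(d: int, n: int) -> Fraction:
    """Mass dimension of the coupling in a phi^n theory: d - n (d-2)/2."""
    return d - n * Fraction(d - 2, 2)


def classify(d: int, n: int) -> str:
    """Classify a phi^n coupling in d dimensions by its mass dimension."""
    dim = coupling_dimension(d, n)
    if dim > 0:
        return "super-renormalizable"
    if dim == 0:
        return "renormalizable"
    return "non-renormalizable"


if __name__ == "__main__":
    for d, n in ((4, 4), (4, 6), (3, 4), (3, 6), (2, 10)):
        print(f"phi^{n} in d={d}: [coupling] = {coupling_dimension(d, n)}, {classify(d, n)}")
```

The cases listed in the loop are the ones quoted in the text: phi^4 and higher powers in four dimensions, phi^4 and phi^6 in three dimensions, and an arbitrary power in two dimensions.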


10.2 Renormalized Perturbation Theory 

In the previous section we saw that a renormalizable quantum field theory
contains only a small number of superficially divergent amplitudes. In QED, for
example, there are three such amplitudes, containing four infinite constants.
In Chapters 6 and 7 these infinities disappeared by the end of our
computations: The infinity in the vertex correction diagram was canceled by the
electron field-strength renormalization, while the infinity in the vacuum
polarization diagram caused only an unobservable shift of the electron's charge.
In fact, it is generally true that the divergences in a renormalizable quantum
field theory never show up in observable quantities.

To obtain a finite result for an amplitude involving divergent diagrams,
we have so far used the following procedure: Compute the diagrams using a
regulator, to obtain an expression that depends on the bare mass ($m_0$), the
bare coupling constant ($e_0$), and some ultraviolet cutoff ($\Lambda$). Then compute the
physical mass ($m$) and the physical coupling constant ($e$), to whatever order
is consistent with the rest of the calculation; these quantities will also depend
on $m_0$, $e_0$, and $\Lambda$. To calculate an $S$-matrix element (rather than a correlation
function), one must also compute the field-strength renormalization(s) $Z$ (in
accord with Eq. (7.45)). Combining all of these expressions, eliminate $m_0$
and $e_0$ in favor of $m$ and $e$; this step is the "renormalization". The resulting
expression for the amplitude should be finite in the limit $\Lambda \to \infty$.

The above procedure always works in a renormalizable quantum field 
theory. However, it can often be cumbersome, especially at higher orders in 
perturbation theory. In this section we will develop an alternative procedure 
which works more automatically. We will do this first for $\phi^4$ theory, returning
to QED in the next section.

The Lagrangian of $\phi^4$ theory is

$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2 - \tfrac{1}{2}m_0^2\phi^2 - \frac{\lambda_0}{4!}\phi^4.$$

We now write $m_0$ and $\lambda_0$, to emphasize that these are the bare values of the
mass and coupling constant, not the values measured in experiments.

The superficial degree of divergence of a diagram with N external legs is, 
according to (10.13), 

D = 4-N. 


Since the theory is invariant under $\phi \to -\phi$, all amplitudes with an odd
number of external legs vanish. The only divergent amplitudes are therefore


Ignoring the vacuum diagram, these amplitudes contain three infinite
constants. Our goal is to absorb these constants into the three unobservable
parameters of the theory: the bare mass, the bare coupling constant, and the
field strength. To accomplish this goal, it is convenient to reformulate the
perturbation expansion so that these unobservable quantities do not appear
explicitly in the Feynman rules.

First we will eliminate the shift in the field strength. Recall from
Section 7.1 that the exact two-point function has the form

$$\int d^4x\, \langle\Omega|\,T\phi(x)\phi(0)\,|\Omega\rangle\, e^{ip\cdot x} = \frac{iZ}{p^2 - m^2} + (\text{terms regular at } p^2 = m^2), \tag{10.14}$$

where $m$ is the physical mass. We can eliminate the awkward residue $Z$ from
this equation by rescaling the field:

$$\phi = Z^{1/2}\phi_r. \tag{10.15}$$

This transformation changes the values of correlation functions by a factor
of $Z^{-1/2}$ for each field. Thus, in computing $S$-matrix elements, we no longer
need the factors of $Z$ in Eq. (7.45); a scattering amplitude is simply the sum
of all connected, amputated diagrams, exactly as we originally guessed in
Eq. (4.103).

The Lagrangian is much uglier after the rescaling:

$$\mathcal{L} = \frac{1}{2}Z(\partial_\mu\phi_r)^2 - \frac{1}{2}m_0^2 Z\phi_r^2 - \frac{\lambda_0}{4!}Z^2\phi_r^4. \tag{10.16}$$

The bare mass and coupling constant still appear in $\mathcal{L}$, but they can be
eliminated as follows. Define

$$\delta_Z = Z - 1, \qquad \delta_m = m_0^2 Z - m^2, \qquad \delta_\lambda = \lambda_0 Z^2 - \lambda, \tag{10.17}$$

where $m$ and $\lambda$ are the physically measured mass and coupling constant. Then
the Lagrangian becomes

$$\begin{aligned}
\mathcal{L} = {}& \frac{1}{2}(\partial_\mu\phi_r)^2 - \frac{1}{2}m^2\phi_r^2 - \frac{\lambda}{4!}\phi_r^4 \\
& + \frac{1}{2}\delta_Z(\partial_\mu\phi_r)^2 - \frac{1}{2}\delta_m\phi_r^2 - \frac{\delta_\lambda}{4!}\phi_r^4.
\end{aligned} \tag{10.18}$$

Figure 10.3. Feynman rules for $\phi^4$ theory in renormalized perturbation
theory.

The first line now looks like the familiar $\phi^4$-theory Lagrangian, but is written
in terms of the physical mass and coupling. The terms in the second line,
known as counterterms, have absorbed the infinite but unobservable shifts
between the bare parameters and the physical parameters. It is tempting to
say that we have “added” these counterterms to the Lagrangian, but in fact
we have merely split each term in (10.16) into two pieces.
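To make the "splitting" concrete, here is a small sketch that inverts the definitions (10.17) and checks that the coefficients of (10.16) equal the sums of the corresponding coefficients in the two lines of (10.18). The function name and all numerical values are hypothetical, chosen only for illustration; in an actual calculation the counterterms are regulator-dependent.

```python
def bare_parameters(m2, lam, delta_z, delta_m, delta_lam):
    """Invert Eq. (10.17): return (Z, m0^2, lambda0) from renormalized inputs."""
    z = 1.0 + delta_z                 # delta_Z = Z - 1
    m0_sq = (m2 + delta_m) / z        # delta_m = m0^2 Z - m^2
    lam0 = (lam + delta_lam) / z**2   # delta_lambda = lambda0 Z^2 - lambda
    return z, m0_sq, lam0


# Hypothetical sample values (not taken from the text).
m2, lam = 1.0, 0.5
delta_z, delta_m, delta_lam = 0.1, 0.3, 0.05
z, m0_sq, lam0 = bare_parameters(m2, lam, delta_z, delta_m, delta_lam)

# Coefficients of (10.16) versus the two lines of (10.18):
print(z, 1.0 + delta_z)             # kinetic term
print(m0_sq * z, m2 + delta_m)      # mass term
print(lam0 * z**2, lam + delta_lam) # quartic term
```

Each printed pair agrees, which is just the statement that no term has been added to the Lagrangian.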

The definitions in (10.17) are not useful unless we give precise definitions
of the physical mass and coupling constant. Equation (10.14) defines $m^2$ as the
location of the pole in the propagator. There is no obviously best definition
of $\lambda$, but a perfectly good definition would be obtained by setting $\lambda$ equal to
the magnitude of the scattering amplitude at zero momentum. Thus we have
the two defining relations,

$$\begin{aligned}
(\text{exact propagator}) &= \frac{i}{p^2 - m^2} + (\text{terms regular at } p^2 = m^2); \\
(\text{amputated four-point amplitude}) &= -i\lambda \quad \text{at } s = 4m^2,\ t = u = 0.
\end{aligned} \tag{10.19}$$

These equations are called renormalization conditions. (The first equation
actually contains two conditions, specifying both the location of the pole and
its residue.)

Our new Lagrangian, Eq. (10.18), gives a new set of Feynman rules, shown
in Fig. 10.3. The propagator and the first vertex come from the first line of
(10.18), and are identical to the old rules except for the appearance of the
physical mass and coupling in place of the bare values. The counterterms in
the second line of (10.18) give two new vertices (also called counterterms).

We can use these new Feynman rules to compute any amplitude in $\phi^4$
theory. The procedure is as follows. Compute the desired amplitude as the
sum of all possible diagrams created from the propagator and vertices shown
in Fig. 10.3. The loop integrals in the diagrams will often diverge, so one
must introduce a regulator. The result of this computation will be a function
of the three unknown parameters $\delta_Z$, $\delta_m$, and $\delta_\lambda$. Adjust (or “renormalize”)
these three parameters as necessary to maintain the renormalization conditions
(10.19). After this adjustment, the expression for the amplitude should
be finite and independent of the regulator.

This procedure, using Feynman rules with counterterms, is known as
renormalized perturbation theory. It should be contrasted with the procedure
we used in Part I, outlined at the beginning of this section, which is called
bare perturbation theory (since the Feynman rules involve the bare mass and
coupling constant). The two methods are completely equivalent. The differences
between them are purely a matter of bookkeeping. You will get the
same answers using either procedure, so you may choose whichever you find
more convenient. In general, renormalized perturbation theory is technically
easier to use, especially for multiloop diagrams; however, bare perturbation
theory is sometimes easier for complicated one-loop calculations. We will use
renormalized perturbation theory in most of the rest of this book.

One-Loop Structure of $\phi^4$ Theory

To make more sense of the renormalization procedure, let us carry it out 
explicitly at the one-loop level. 

First consider the basic two-particle scattering amplitude,

[Diagrams omitted: the tree-level vertex, the $s$-, $t$-, and $u$-channel one-loop diagrams, and the $\delta_\lambda$ counterterm vertex.]

If we define $p = p_1 + p_2$, then the second diagram is

$$\frac{(-i\lambda)^2}{2}\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m^2}\,\frac{i}{(k+p)^2 - m^2} \equiv (-i\lambda)^2\cdot iV(p^2). \tag{10.20}$$

Note that $p^2$ is equal to the Mandelstam variable $s$. The next two diagrams
are identical, except that $s$ will be replaced by $t$ and $u$. The entire amplitude
is therefore

$$i\mathcal{M} = -i\lambda + (-i\lambda)^2\left[iV(s) + iV(t) + iV(u)\right] - i\delta_\lambda. \tag{10.21}$$

According to our renormalization condition (10.19), this amplitude should
equal $-i\lambda$ at $s = 4m^2$ and $t = u = 0$. We must therefore set

$$\delta_\lambda = -\lambda^2\left[V(4m^2) + 2V(0)\right]. \tag{10.22}$$

(At higher orders, $\delta_\lambda$ will receive additional contributions.)

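We can compute $V(p^2)$ explicitly using dimensional regularization. The
procedure is exactly the same as in Section 7.5: Introduce a Feynman parameter,
shift the integration variable, rotate to Euclidean space, and perform the
momentum integral. We obtain

$$\begin{aligned}
V(p^2) &= \frac{i}{2}\int_0^1 dx \int\frac{d^dk}{(2\pi)^d}\,\frac{1}{[k^2 + 2xk\cdot p + xp^2 - m^2]^2} \\
&= \frac{i}{2}\int_0^1 dx \int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{[\ell^2 + x(1-x)p^2 - m^2]^2} \qquad (\ell = k + xp) \\
&= -\frac{1}{2}\int_0^1 dx \int\frac{d^d\ell_E}{(2\pi)^d}\,\frac{1}{[\ell_E^2 - x(1-x)p^2 + m^2]^2} \qquad (\ell_E^0 = -i\ell^0) \\
&= -\frac{1}{2}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\,\frac{1}{[m^2 - x(1-x)p^2]^{2-d/2}} \\
&\underset{d\to 4}{\longrightarrow} -\frac{1}{32\pi^2}\int_0^1 dx\,\left(\frac{2}{\epsilon} - \gamma + \log(4\pi) - \log[m^2 - x(1-x)p^2]\right),
\end{aligned} \tag{10.23}$$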

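where $\epsilon = 4 - d$. The shift in the coupling constant (10.22) is therefore

$$\begin{aligned}
\delta_\lambda &= \frac{\lambda^2}{2}\,\frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\int_0^1 dx\,\left(\frac{1}{[m^2 - x(1-x)4m^2]^{2-d/2}} + \frac{2}{[m^2]^{2-d/2}}\right) \\
&\underset{d\to 4}{\longrightarrow} \frac{\lambda^2}{32\pi^2}\int_0^1 dx\,\left(\frac{6}{\epsilon} - 3\gamma + 3\log(4\pi) - \log[m^2 - x(1-x)4m^2] - 2\log[m^2]\right).
\end{aligned} \tag{10.24}$$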

These expressions are divergent as $d \to 4$. But if we combine them according
to (10.21), we obtain the finite (if rather complicated) result,

$$i\mathcal{M} = -i\lambda - \frac{i\lambda^2}{32\pi^2}\int_0^1 dx\,\left[\log\left(\frac{m^2 - x(1-x)s}{m^2 - x(1-x)4m^2}\right) + \log\left(\frac{m^2 - x(1-x)t}{m^2}\right) + \log\left(\frac{m^2 - x(1-x)u}{m^2}\right)\right]. \tag{10.25}$$
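As a numerical illustration of (10.25), the sketch below evaluates the Feynman-parameter integral with a simple midpoint rule. It is restricted, as an assumption made here to keep every logarithm real, to $s$, $t$, $u$ at or below $4m^2$; the function names, the sample coupling, and the grid size `n` are all hypothetical choices. The midpoint grid with even `n` never samples $x = 1/2$, where the denominator of the first logarithm vanishes (an integrable singularity).

```python
import math


def log_integral(m2, a, b, n=20000):
    """Midpoint estimate of int_0^1 dx log[(m2 - x(1-x) a) / (m2 - x(1-x) b)]."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n          # even n avoids x = 1/2 exactly
        w = x * (1.0 - x)
        total += math.log((m2 - w * a) / (m2 - w * b))
    return total / n


def one_loop_amplitude(lam, m2, s, t, u):
    """Real amplitude M from Eq. (10.25), assuming s, t, u <= 4 m^2."""
    bracket = (log_integral(m2, s, 4.0 * m2)
               + log_integral(m2, t, 0.0)
               + log_integral(m2, u, 0.0))
    return -lam - lam**2 / (32.0 * math.pi**2) * bracket


lam, m2 = 0.5, 1.0  # hypothetical sample values
# At the renormalization point the one-loop correction vanishes: M = -lambda.
print(one_loop_amplitude(lam, m2, 4.0 * m2, 0.0, 0.0))
# Away from it the correction is finite; at s = t = u = 0 the integral
# equals 2, so M = -lambda - lambda^2 / (16 pi^2).
print(one_loop_amplitude(lam, m2, 0.0, 0.0, 0.0))
```

The second value can be checked by hand, since $\int_0^1 dx\,\log[1/(1-2x)^2] = 2$.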
To determine $\delta_Z$ and $\delta_m$ we must compute the two-point function. As in
Section 7.2, let us define $-iM^2(p^2)$ as the sum of all one-particle-irreducible
insertions into the propagator:

$$(\text{sum of 1PI insertions}) = -iM^2(p^2). \tag{10.26}$$

Then the full two-point function is given by the geometric series,

$$(\text{full propagator}) = \frac{i}{p^2 - m^2 - M^2(p^2)}. \tag{10.27}$$
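The geometric series behind (10.27) is easy to check numerically. In this sketch each insertion contributes a factor $(-iM^2)\cdot i/(p^2 - m^2) = M^2/(p^2 - m^2)$; the function names and the sample numbers are hypothetical, and $M^2$ is treated as a fixed number chosen small enough that the series converges.

```python
def dressed_propagator_series(p2, m2, M2, n_terms=60):
    """Sum free propagators joined by 1PI insertions -i M^2, term by term."""
    free = 1j / (p2 - m2)        # free propagator i / (p^2 - m^2)
    ratio = (-1j * M2) * free    # one insertion followed by one propagator
    total = 0j
    term = free
    for _ in range(n_terms):
        total += term
        term *= ratio
    return total


def dressed_propagator_closed(p2, m2, M2):
    """Closed form of the series, Eq. (10.27)."""
    return 1j / (p2 - m2 - M2)


# Hypothetical sample point with |M^2 / (p^2 - m^2)| < 1.
print(dressed_propagator_series(2.0, 1.0, 0.3))
print(dressed_propagator_closed(2.0, 1.0, 0.3))
```

The two printed values agree to machine precision for these inputs.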


The renormalization conditions (10.19) require that the pole in this full
propagator occur at $p^2 = m^2$ and have residue 1. These two conditions are
equivalent, respectively, to

$$M^2(p^2)\Big|_{p^2 = m^2} = 0 \qquad\text{and}\qquad \frac{d}{dp^2}M^2(p^2)\Big|_{p^2 = m^2} = 0. \tag{10.28}$$

(To check the latter condition, expand $M^2$ about $p^2 = m^2$ in Eq. (10.27).)
Explicitly, to one-loop order,

$$\begin{aligned}
-iM^2(p^2) &= -i\lambda\cdot\frac{1}{2}\int\frac{d^dk}{(2\pi)^d}\,\frac{i}{k^2 - m^2} + i(p^2\delta_Z - \delta_m) \\
&= -\frac{i\lambda}{2}\,\frac{1}{(4\pi)^{d/2}}\,\frac{\Gamma(1-\frac{d}{2})}{(m^2)^{1-d/2}} + i(p^2\delta_Z - \delta_m).
\end{aligned} \tag{10.29}$$

Since the first term is independent of $p^2$, the result is rather trivial:
Setting

$$\delta_Z = 0 \qquad\text{and}\qquad \delta_m = -\frac{\lambda}{2(4\pi)^{d/2}}\,\frac{\Gamma(1-\frac{d}{2})}{(m^2)^{1-d/2}} \tag{10.30}$$

yields $M^2(p^2) = 0$ for all $p^2$, satisfying both of the conditions in (10.28).

The first nonzero contributions to $M^2(p^2)$ and $\delta_Z$ are proportional to $\lambda^2$,
coming from the diagrams

[Diagrams omitted: three $O(\lambda^2)$ contributions to the two-point function.] (10.31)

The second diagram contains the $\delta_\lambda$ counterterm, which we have already
computed. It cancels ultraviolet divergences in the first diagram that occur when
one of the loop momenta is large and the other is small. The third diagram
is again the $(p^2\delta_Z - \delta_m)$ counterterm, and is fixed to order $\lambda^2$ by requiring
that the remaining divergences (when both loop momenta become large) cancel.
In Section 10.4 we will see an explicit example of the interplay of various
counterterms in a two-loop calculation.

The vanishing of $\delta_Z$ at one-loop order is a special feature of $\phi^4$ theory,
which does not occur in more general theories of scalar fields. The Yukawa
theory described in Section 4.7 gives an explicit example of a one-loop correction
for which this counterterm is required.

In the Yukawa theory, the scalar field propagator receives corrections at
order $g^2$ from a fermion loop diagram and the two propagator counterterms.
Using the Feynman rules on p. 118 to compute the loop diagram, we find

$$\begin{aligned}
-iM^2(p^2) &= -(-ig)^2\int\frac{d^dk}{(2\pi)^d}\,\mathrm{tr}\left[\frac{i(\slashed{k} + \slashed{p} + m_f)}{(k+p)^2 - m_f^2}\,\frac{i(\slashed{k} + m_f)}{k^2 - m_f^2}\right] + i(p^2\delta_Z - \delta_m) \\
&= -4g^2\int\frac{d^dk}{(2\pi)^d}\,\frac{k\cdot(p+k) + m_f^2}{((p+k)^2 - m_f^2)(k^2 - m_f^2)} + i(p^2\delta_Z - \delta_m),
\end{aligned} \tag{10.32}$$


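where $m_f$ is the mass of the fermion that couples to the Yukawa field. To
evaluate the integral, combine denominators and shift as in Eq. (10.23). Then
the first term in the last line becomes

$$\begin{aligned}
& -4g^2\int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{\ell^2 - x(1-x)p^2 + m_f^2}{(\ell^2 + x(1-x)p^2 - m_f^2)^2} \\
&\quad = -4g^2\int_0^1 dx\,\frac{-i}{(4\pi)^{d/2}}\left(\frac{d}{2}\,\frac{\Gamma(1-\frac{d}{2})}{\Delta^{1-d/2}} - \Delta\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}\right) \\
&\quad = \frac{4ig^2(d-1)}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(1-\frac{d}{2})}{\Delta^{1-d/2}},
\end{aligned} \tag{10.33}$$

where $\Delta = m_f^2 - x(1-x)p^2$.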

Now we can see that both of the counterterms $\delta_m$ and $\delta_Z$ must take
nonzero values in order to satisfy the renormalization conditions (10.28). To
determine $\delta_m$, we subtract the value of the loop diagram at $p^2 = m^2$ as before,
so that

$$\delta_m = \frac{4g^2(d-1)}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(1-\frac{d}{2})}{[m_f^2 - x(1-x)m^2]^{1-d/2}} + m^2\delta_Z. \tag{10.34}$$


To determine $\delta_Z$, we cancel also the first derivative with respect to $p^2$ of the
loop integral (10.33). This gives

$$\begin{aligned}
\delta_Z &= -\frac{4g^2(d-1)}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{x(1-x)\,\Gamma(2-\frac{d}{2})}{[m_f^2 - x(1-x)m^2]^{2-d/2}} \\
&\underset{d\to 4}{\longrightarrow} -\frac{3g^2}{4\pi^2}\int_0^1 dx\,x(1-x)\left(\frac{2}{\epsilon} - \gamma - \frac{2}{3} + \log(4\pi) - \log[m_f^2 - x(1-x)m^2]\right).
\end{aligned} \tag{10.35}$$

Thus, in Yukawa theory, the propagator corrections at one-loop order require
a quadratically divergent mass renormalization and a logarithmically divergent
field strength renormalization. This is the usual situation in scalar field
theories.
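The $d \to 4$ limit in (10.35), including the constant $-2/3$ that comes from the factor $(d-1) = 3 - \epsilon$ multiplying the $2/\epsilon$ pole, can be checked numerically. The sketch below compares the exact $d$-dimensional integrand with its expansion at small $\epsilon$, with the overall factor $-g^2\,x(1-x)$ stripped off. The function names and sample values are hypothetical, and `delta` stands for the combination $m_f^2 - x(1-x)m^2$ at one fixed value of $x$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def delta_z_integrand_exact(eps, delta):
    """4 (d-1) Gamma(2 - d/2) / ((4 pi)^(d/2) delta^(2 - d/2)), d = 4 - eps."""
    d = 4.0 - eps
    return (4.0 * (d - 1.0) * math.gamma(2.0 - d / 2.0)
            / ((4.0 * math.pi) ** (d / 2.0) * delta ** (2.0 - d / 2.0)))


def delta_z_integrand_expanded(eps, delta):
    """Small-eps form: (3 / 4 pi^2) (2/eps - gamma - 2/3 + log 4 pi - log delta)."""
    return 3.0 / (4.0 * math.pi**2) * (
        2.0 / eps - EULER_GAMMA - 2.0 / 3.0
        + math.log(4.0 * math.pi) - math.log(delta))


eps, delta = 1e-3, 0.75  # hypothetical sample point
print(delta_z_integrand_exact(eps, delta))
print(delta_z_integrand_expanded(eps, delta))
```

The two values differ only at order $\epsilon$, which is the expected size of the neglected terms.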

10.3 Renormalization of Quantum Electrodynamics 

The procedure we followed in the previous section, yielding a “renormalized”
perturbation theory formulated in terms of physically measurable parameters,
can be summarized as follows:

1. Absorb the field-strength renormalizations into the Lagrangian by rescaling
the fields.

2. Split each term of the Lagrangian into two pieces, absorbing the infinite
and unobservable shifts into counterterms.

3. Specify the renormalization conditions, which define the physical masses
and coupling constants and keep the field-strength renormalizations equal
to 1.

4. Compute amplitudes with the new Feynman rules, adjusting the counterterms
as necessary to maintain the renormalization conditions.

Let us now use this procedure to construct a renormalized perturbation theory
for Quantum Electrodynamics.

The original QED Lagrangian is

$$\mathcal{L} = -\frac{1}{4}(F_{\mu\nu})^2 + \bar\psi(i\slashed{\partial} - m_0)\psi - e_0\bar\psi\gamma^\mu\psi A_\mu.$$

Computing the electron and photon propagators with this Lagrangian, we
would find expressions of the general form

$$\frac{iZ_2}{\slashed{p} - m} + \cdots \qquad\text{and}\qquad \frac{-iZ_3\,g_{\mu\nu}}{q^2} + \cdots.$$

(We found just such expressions in the explicit one-loop calculations of
Chapter 7.) To absorb $Z_2$ and $Z_3$ into $\mathcal{L}$, and hence eliminate them from formula
(7.45) for the $S$-matrix, we substitute $\psi = Z_2^{1/2}\psi_r$ and $A^\mu = Z_3^{1/2}A_r^\mu$. Then
the Lagrangian becomes

$$\mathcal{L} = -\frac{1}{4}Z_3(F_r^{\mu\nu})^2 + Z_2\bar\psi_r(i\slashed{\partial} - m_0)\psi_r - e_0 Z_2 Z_3^{1/2}\bar\psi_r\gamma^\mu\psi_r A_{r\mu}. \tag{10.36}$$

We can introduce the physical electric charge $e$, measured at large distances
($q = 0$), by defining a scaling factor $Z_1$ as follows:†

$$e_0 Z_2 Z_3^{1/2} = eZ_1. \tag{10.37}$$

If we let $m$ be the physical mass (the location of the pole in the electron
propagator), then we can split each term of the Lagrangian into two pieces as
follows:

$$\begin{aligned}
\mathcal{L} = {}& -\frac{1}{4}(F_r^{\mu\nu})^2 + \bar\psi_r(i\slashed{\partial} - m)\psi_r - e\bar\psi_r\gamma^\mu\psi_r A_{r\mu} \\
& -\frac{1}{4}\delta_3(F_r^{\mu\nu})^2 + \bar\psi_r(i\delta_2\slashed{\partial} - \delta_m)\psi_r - e\delta_1\bar\psi_r\gamma^\mu\psi_r A_{r\mu},
\end{aligned} \tag{10.38}$$

where

$$\delta_3 = Z_3 - 1, \qquad \delta_2 = Z_2 - 1,$$
$$\delta_m = Z_2 m_0 - m, \qquad\text{and}\qquad \delta_1 = Z_1 - 1 = (e_0/e)Z_2 Z_3^{1/2} - 1.$$

The Feynman rules for renormalized QED are shown in Fig. 10.4. In
addition to the familiar propagators and vertex, there are three counterterm
vertices. The $ee$ and $ee\gamma$ counterterm vertices can be read directly from the
Lagrangian (10.38). To derive the two-photon counterterm, integrate $-\frac{1}{4}(F_{\mu\nu})^2$
by parts to obtain $-\frac{1}{2}A_\mu(-\partial^2 g^{\mu\nu} + \partial^\mu\partial^\nu)A_\nu$; this gives the expression shown
in the figure. In the remainder of the book, when we set up renormalized
perturbation theory, we will drop the subscript $r$ used here to distinguish the
rescaled fields.

Each of the four counterterm coefficients must be fixed by a renormalization
condition. The four conditions that we require have already been stated
implicitly: Two of them fix the electron and photon field-strength
renormalizations to 1, while the other two define the physical electron mass and charge.
To write these conditions more explicitly, recall our notation from Chapters
6 and 7:

$$\begin{aligned}
(\text{photon 1PI two-point function}) &= i\Pi^{\mu\nu}(q) = i(g^{\mu\nu}q^2 - q^\mu q^\nu)\Pi(q^2); \\
(\text{electron 1PI two-point function}) &= -i\Sigma(\slashed{p}); \\
(\text{amputated electron-photon vertex}) &= -ie\Gamma^\mu(p', p).
\end{aligned} \tag{10.39}$$

†Since we define $e$ by the renormalization condition $\Gamma^\mu(q = 0) = \gamma^\mu$, the factor
of $Z_1$ in the Lagrangian must cancel the multiplicative correction factor that arises
from loop corrections. Therefore this definition of $Z_1$ is equivalent to that given in
Eq. (7.47).

Figure 10.4. Feynman rules for Quantum Electrodynamics in renormalized
perturbation theory.

These amplitudes are now to be computed in renormalized perturbation theory;
that is, we are now redefining $\Pi(q^2)$, $\Sigma(\slashed{p})$, and $\Gamma(p', p)$ to include
counterterm vertices. Furthermore, the new definition of $\Gamma$ involves the physical
electron charge. With this notation, the four conditions are

$$\begin{aligned}
\Sigma(\slashed{p} = m) &= 0; \\
\frac{d}{d\slashed{p}}\Sigma(\slashed{p})\Big|_{\slashed{p} = m} &= 0; \\
\Pi(q^2 = 0) &= 0; \\
-ie\Gamma^\mu(p' - p = 0) &= -ie\gamma^\mu.
\end{aligned} \tag{10.40}$$

The first condition fixes the electron mass at $m$, while the next two fix the
residues of the electron and photon propagators at 1. Given these conditions,
the final condition fixes the electron charge to be $e$.


One-Loop Structure of QED 

The four conditions (10.40) allow us to determine the four counterterms in
(10.38) in terms of the values of loop diagrams. In Chapters 6 and 7 we
computed all of the diagrams required to carry out this determination to one-loop
order. We will now collect these results and find explicit expressions for the
renormalization constants of QED to order $\alpha$. For overall consistency, we will
use dimensional regularization to control ultraviolet divergences, and a photon
mass $\mu$ to control infrared divergences. In Part I, we computed the vertex
and self-energy diagrams using the Pauli-Villars regularization scheme, before
introducing dimensional regularization. Now we have an opportunity to quote
the values of these diagrams as computed with dimensional regularization.

The first two conditions involve the electron self-energy. We evaluated
the one-loop diagram contributing to $\Sigma(\slashed{p})$, using a Pauli-Villars regulator, in
Section 7.1; the result is given in Eq. (7.19). If we re-evaluate the diagram
in dimensional regularization, we find some additional terms in the Dirac
algebra from the modified contraction identities (7.89). Taking these terms
into account, we find for this diagram ($\epsilon = 4 - d$)

$$-i\Sigma_2(\slashed{p}) = -i\,\frac{e^2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{((1-x)m^2 + x\mu^2 - x(1-x)p^2)^{2-d/2}} \times \big((4-\epsilon)m - (2-\epsilon)x\slashed{p}\big). \tag{10.41}$$

Therefore, according to the first of conditions (10.40),

$$m\delta_2 - \delta_m = \Sigma_2(m) = \frac{e^2 m}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})\cdot(4 - 2x - \epsilon(1-x))}{((1-x)^2 m^2 + x\mu^2)^{2-d/2}}. \tag{10.42}$$


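Similarly, the second of conditions (10.40) determines $\delta_2$:

$$\begin{aligned}
\delta_2 = \frac{d}{d\slashed{p}}\Sigma_2\Big|_{\slashed{p} = m} = {}& -\frac{e^2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{((1-x)^2 m^2 + x\mu^2)^{2-d/2}} \\
& \times\left[(2-\epsilon)x - \frac{\epsilon}{2}\,\frac{2x(1-x)m^2}{(1-x)^2 m^2 + x\mu^2}\,(4 - 2x - \epsilon(1-x))\right].
\end{aligned} \tag{10.43}$$

Notice that the second term in the brackets gives a finite result as $\epsilon \to 0$,
because its explicit factor of $\epsilon$ multiplies the divergent gamma function.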

The third condition of (10.40) requires the value (7.90) of the photon
self-energy diagram:

$$\Pi_2(q^2) = -\frac{e^2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{(m^2 - x(1-x)q^2)^{2-d/2}}\,\big(8x(1-x)\big).$$

Then

$$\delta_3 = \Pi_2(0) = -\frac{e^2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{(m^2)^{2-d/2}}\,\big(8x(1-x)\big). \tag{10.44}$$





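The last condition requires the value of the electron vertex function, computed
in Section 6.3. Again, we will rework the diagram in dimensional regularization.
Then the shift in the form factor $F_1(q^2)$ (6.56) becomes

$$\begin{aligned}
\delta F_1(q^2) = {}& \frac{e^2}{(4\pi)^{d/2}}\int dx\,dy\,dz\,\delta(x+y+z-1)\left[\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}\,\frac{(2-\epsilon)^2}{2}\right. \\
& \left. + \frac{\Gamma(3-\frac{d}{2})}{\Delta^{3-d/2}}\Big(q^2[2(1-x)(1-y) - \epsilon xy] + m^2[2(1-4z+z^2) - \epsilon(1-z)^2]\Big)\right],
\end{aligned} \tag{10.45}$$

where $\Delta = (1-z)^2 m^2 + z\mu^2 - xyq^2$ as before. The fourth renormalization
condition then determines

$$\begin{aligned}
\delta_1 = -\delta F_1(0) = {}& -\frac{e^2}{(4\pi)^{d/2}}\int_0^1 dz\,(1-z)\left[\frac{\Gamma(2-\frac{d}{2})}{((1-z)^2 m^2 + z\mu^2)^{2-d/2}}\,\frac{(2-\epsilon)^2}{2}\right. \\
& \left. + \frac{\Gamma(3-\frac{d}{2})}{((1-z)^2 m^2 + z\mu^2)^{3-d/2}}\,[2(1-4z+z^2) - \epsilon(1-z)^2]\,m^2\right].
\end{aligned} \tag{10.46}$$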


Using an integration by parts similar to that following Eq. (7.32), one can
show explicitly from (10.46) and (10.43) that $\delta_1 = \delta_2$, that is, that $Z_1 = Z_2$
to order $\alpha$. As in our previous derivations, this formula follows from the Ward
identity. The Lagrangian (10.38), with counterterms set to zero, is gauge
invariant. If the regulator is also gauge invariant (and we do use dimensional
regularization), this implies the Ward identity for diagrams without counterterm
vertices. In particular, this implies that $\delta F_1(0) = -d\Sigma_2/d\slashed{p}\,|_m$. Then the
counterterms $\delta_1$ and $\delta_2$, which are required to cancel these two factors, will
be set equal.

By continuing this argument, it is straightforward to construct a full
diagrammatic proof that $\delta_1 = \delta_2$, to all orders in renormalized perturbation
theory, using the method we applied in Section 7.4 to prove the Ward-Takahashi
identity in bare perturbation theory. With a generalization of the argument
given there, one can show that the diagrammatic identity (7.68) holds for
diagrams that include counterterm vertices in loops. Thus, if the counterterms
$\delta_1$ and $\delta_2$ are determined up to order $\alpha^n$, the unrenormalized vertex diagram
at $q^2 = 0$ equals the derivative of the unrenormalized self-energy diagram
on-shell in order $\alpha^{n+1}$. To satisfy the renormalization conditions (10.40), we
must then set the counterterms $\delta_1$ and $\delta_2$ equal to order $\alpha^{n+1}$. This recursive
argument gives yet another proof that $Z_1 = Z_2$ to all orders in QED
perturbation theory.

The relation (10.37) between the bare and renormalized charge

$$e = \frac{Z_2}{Z_1}\,Z_3^{1/2}\,e_0 \tag{10.47}$$

gives a further physical interpretation of the identity $Z_1 = Z_2$. Using the
identity, we can rewrite (10.47) as

$$e = \sqrt{Z_3}\,e_0,$$

which is just the relation (7.76) that we derived by a diagrammatic argument
in Section 7.5. This says that the relation between the bare and renormalized
electric charge depends only on the photon field strength renormalization,
not on quantities particular to the electron. To see the importance of this
observation, consider writing the renormalized quantum electrodynamics with
two species of charged particles, say, electrons and muons. Then, in addition
to (10.37), we will have a relation for the photon-muon vertex:

$$e_0 Z_2' Z_3^{1/2} = eZ_1', \tag{10.48}$$

where $Z_1'$ and $Z_2'$ are the vertex and field strength renormalizations for the
muon. Each of these two constants depends on the mass of the muon, so
(10.48) threatens to give a different relation between $e_0$ and $e$ from the one
written in (10.47). However, the Ward identity forces the factors $Z_1'$ and $Z_2'$
to cancel out of this relation, leaving over a universal electric charge which
has the same value for all species.
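The cancellation can be illustrated with a toy calculation. This is only a sketch: the function name and all numerical values of $e_0$, $Z_1$, $Z_2$, and $Z_3$ are hypothetical placeholders (the actual constants are regulator-dependent). The Ward identity is imposed by hand, by choosing $Z_1 = Z_2$ separately for each species.

```python
import math


def renormalized_charge(e0, z1, z2, z3):
    """Eq. (10.47): e = (Z2 / Z1) * sqrt(Z3) * e0."""
    return (z2 / z1) * math.sqrt(z3) * e0


e0, z3 = 0.35, 0.9  # hypothetical bare charge and photon renormalization

# The species-dependent constants differ, but Z1 = Z2 within each species.
e_electron = renormalized_charge(e0, z1=1.07, z2=1.07, z3=z3)
e_muon = renormalized_charge(e0, z1=1.02, z2=1.02, z3=z3)

print(e_electron, e_muon, math.sqrt(z3) * e0)
```

All three printed numbers coincide. If $Z_1$ and $Z_2$ were allowed to differ for one species, that species would acquire a different renormalized charge.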

10.4 Renormalization Beyond the Leading Order 

In the last two sections we have developed an algorithm for computing
scattering amplitudes to any order in a renormalizable field theory. We have seen
explicitly that this algorithm yields finite results at the one-loop level in both
$\phi^4$ theory and QED. According to the naive analysis of Section 10.1, the
algorithm should also work at higher orders. But that analysis ignored many
of the intricacies of multiloop diagrams; specifically, it ignored the fact that
diagrams can contain divergent subdiagrams.

When an otherwise finite diagram contains a divergent subdiagram, the 
treatment of the divergence is relatively straightforward. For example, the 
sum of diagrams 


(10.49) 


is finite: The divergence in the photon propagator cancels just as when this
propagator occurs in a tree diagram. The finite sum of the two propagator
diagrams gives an integrand for the outer loop that falls off fast enough that
this integral still converges.

A more difficult situation occurs when we have nested or overlapping
divergences, that is, when two divergent loops share a propagator. Some
examples of diagrams with overlapping divergences are


in $\phi^4$ theory;


in QED. 


To see the difficulty, consider the photon self-energy diagram: 


One contribution to this diagram comes from the region of momentum space
where $k_2$ is very large. This means that, in position space, $x$, $y$, and $z$ are very
close together, while $w$ can be farther away. In this region we can think of the
virtual photon as giving a correction to the vertex at $x$. We saw in Section 6.3
that this vertex correction is logarithmically divergent, of the form

$$\sim -ie\gamma^\mu\cdot\alpha\log\Lambda^2$$

in the limit $\Lambda \to \infty$. Plugging this vertex into the rest of the diagram and
integrating over $k_1$, we obtain an expression identical to the one-loop photon
self-energy correction $\Pi_2(q^2)$, displayed in (7.90), multiplied by the additional
logarithmic divergence:

$$\begin{aligned}
& \sim (g^{\mu\nu}q^2 - q^\mu q^\nu)\,\Pi_2(q^2)\cdot\alpha\log\Lambda^2 \\
& \sim \alpha(g^{\mu\nu}q^2 - q^\mu q^\nu)(\log\Lambda^2 + \log q^2)\cdot\alpha\log\Lambda^2.
\end{aligned} \tag{10.50}$$

The $\log^2\Lambda^2$ term comes from the region where both $k_1$ and $k_2$ are large, while
the $\log q^2\,\log\Lambda^2$ term comes from the region where $k_2$ is large but $k_1$ is small.
Another such term would come from the region where $k_1$ is large but $k_2$ is
small.

The appearance of terms proportional to $\Pi_2(q^2)\cdot\log\Lambda^2$ in the two-loop
vacuum polarization diagram contradicts our naive argument, based on the
criterion of the superficial degree of divergence, that the divergent terms of
a Feynman integral are always simple polynomials in $q^2$. We will refer to
divergences multiplying only polynomials in $q^2$ as local divergences, since their
Fourier transforms back to position space are delta functions or derivatives
of delta functions. We will call the new, nonpolynomial, term a nonlocal
divergence. Fortunately, our derivation of the nonlocal divergent term gave this
term a physical interpretation: It is a local divergence surrounded by an
ordinary, nondivergent, quantum field theory process.

If this picture accurately describes all of the divergent terms of the two-loop
diagram, we should expect that these divergences are canceled by two
types of counterterm diagrams. First, we can build diagrams of order $\alpha^2$ by
inserting the order-$\alpha$ counterterm vertex into the one-loop vacuum polarization
diagram:


These diagrams should cancel the nonlocal divergence in (10.50) and the
corresponding contribution from the region where $k_1$ is large and $k_2$ is small. In
fact, a detailed analysis shows that the sum of the original diagram and these
two counterterm diagrams contains only local divergences. Once these
diagrams are added, the only divergence that remains is a local one, which can
be canceled by the diagram


that is, by adding an order-$\alpha^2$ term to $\delta_3$.

We can extend the lessons of this example to a general picture of the
divergences of higher-loop Feynman diagrams and their cancellation. A given
diagram may contain local divergences, as predicted by the analysis of Section
10.1. It may also contain nonlocal divergences due to divergent subgraphs
embedded in loops carrying small momenta. These divergences are canceled by
diagrams in which the divergent subgraphs are replaced by their counterterm
vertices. One might still ask two questions: First, does this procedure remove
all nonlocal divergences? Second, does this procedure preserve the finiteness
of amplitudes, such as (10.49), that are not expected to be divergent by the
superficial criteria of Section 10.1? To answer these questions requires an
intricate study of nested Feynman integrals. The general analysis was begun
by Bogoliubov and Parasiuk, completed by Hepp, and elegantly refined by
Zimmermann;* they showed that the answer to both questions is yes. Their
result, known as the BPHZ theorem, states that, for a general renormalizable
quantum field theory, to any order in perturbation theory, all divergences are
removed by the counterterm vertices corresponding to superficially divergent
amplitudes. In other words, any superficially renormalizable quantum field
theory is in fact rendered finite when one performs renormalized perturbation
theory with the complete set of counterterms.

The proof of the BPHZ theorem is quite technical, and we will not include 
it in this book. Instead, we will investigate one detailed example of a two-loop 
calculation, which demonstrates explicitly the appearance and cancellation of 
nonlocal divergences. 

10.5 A Two-Loop Example 

To illustrate the issues discussed in the previous section, let us consider the
two-loop contribution to the four-point function in $\phi^4$ theory. There are 16
relevant diagrams, shown in Fig. 10.5. (There are also several diagrams involving
the one-loop correction to the propagator. But each of these is exactly
canceled by its counterterm, as we saw in Eq. (10.29), so we can just ignore them.)
Fortunately, many of the diagrams are simply related to each other. Crossing
symmetry reduces the number of distinct diagrams to only six,


(10.51) 


where the last diagram denotes only the s-channel piece of the second-order 
vertex counterterm. If this sum of diagrams is finite, then simply replacing s 
with t or u gives a finite result for the remaining diagrams. 

The value of the last diagram in (10.51) is just a constant, which we can
freely adjust to absorb any divergent terms that are independent of the
external momenta. Our goal, therefore, is to show that all momentum-dependent
divergent terms cancel among the remaining five diagrams.

The fourth and fifth diagrams in (10.51) involve the one-loop vertex
counterterm, which we computed in Eq. (10.24). Let us briefly recall that
computation. We defined $iV(p^2)$ as the fundamental loop integral,

$$(\text{one-loop diagram}) = (-i\lambda)^2\cdot iV(p^2) = (-i\lambda)^2\cdot\left[-\frac{i}{2}\,\frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{1}{[m^2 - x(1-x)p^2]^{2-d/2}}\right]. \tag{10.52}$$

*N. N. Bogoliubov and O. S. Parasiuk, Acta Math. 97, 227 (1957); K. Hepp,
Comm. Math. Phys. 2, 301 (1966); W. Zimmermann, in Deser, et al. (1970).

Figure 10.5. The two-loop contributions to the four-point function in $\phi^4$
theory. Note that the diagrams in the first three lines are related to each
other by crossing, being in the $s$-, $t$-, and $u$-channels, respectively. The last
two diagrams in each of these lines involve the $O(\lambda^2)$ vertex counterterm,
while the final diagram is the $O(\lambda^3)$ contribution to the vertex counterterm.

The counterterm, according to the renormalization condition (10.19), had
to cancel the three one-loop diagrams (one for each channel) at threshold
($s = 4m^2$, $t = u = 0$); thus we found

$$(\text{vertex counterterm}) = -i\delta_\lambda = (-i\lambda)^2\left[-iV(4m^2) - 2iV(0)\right].$$

For our present purposes it will be convenient to separate the two terms of
this expression. Let us therefore define

$$\begin{aligned}
(\text{$s$-channel piece of the counterterm}) &= (-i\lambda)^2\cdot\big(-iV(4m^2)\big); \\
(\text{$t$- and $u$-channel piece of the counterterm}) &= (-i\lambda)^2\cdot\big(-2iV(0)\big).
\end{aligned}$$


We can now divide the first five diagrams in (10.51) into three groups, as
follows:

[Diagrams omitted: Groups I, II, and III.]

We will find that all divergent terms that depend on momentum cancel
separately within each group. Since Groups II and III are related by a simple
interchange of initial and final momenta, it suffices to demonstrate this
cancellation for Groups I and II.

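Group I is actually quite easy, since each diagram factors into a product
of objects we have already computed. Referring to Eq. (10.52), we have

$$\begin{aligned}
(\text{two-loop diagram of Group I}) &= (-i\lambda)^3\cdot[iV(p^2)]^2; \\
(\text{each counterterm diagram of Group I}) &= (-i\lambda)^3\cdot iV(p^2)\cdot\big(-iV(4m^2)\big).
\end{aligned}$$

The sum of all three diagrams is therefore

$$\begin{aligned}
& (-i\lambda)^3\Big([iV(p^2)]^2 - 2\,iV(p^2)\,iV(4m^2)\Big) \\
&\quad = (-i\lambda)^3\Big(-[V(p^2) - V(4m^2)]^2 + [V(4m^2)]^2\Big).
\end{aligned} \tag{10.53}$$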


But the difference $V(p^2) - V(4m^2)$ is finite, as was required for the cancellation
of divergences in the one-loop calculation:

$$V(p^2) - V(4m^2) = \frac{1}{32\pi^2}\int_0^1 dx\,\log\left(\frac{m^2 - x(1-x)p^2}{m^2 - x(1-x)4m^2}\right).$$

The only remaining divergence is in the term $[V(4m^2)]^2$, which is independent
of momentum and can therefore be absorbed into the second-order
counterterm in (10.51).

Two general properties of result (10.53) are worth noting. First, the
divergent piece (and hence the $O(\lambda^3)$ vertex counterterm) is proportional to

$$[V(4m^2)]^2 \propto \left[\Gamma(2-\tfrac{d}{2})\right]^2 \underset{d\to 4}{\longrightarrow} \left(\frac{2}{\epsilon}\right)^2 \qquad\text{for } d = 4 - \epsilon.$$

This is a double pole, in contrast to the simple pole we found for the one-loop
counterterm. Higher-loop diagrams will similarly have higher-order poles, but
in all cases the divergent terms are momentum-independent constants. Second,
consider the large-momentum limit,

$$V(p^2) - V(4m^2) \underset{p^2\to\infty}{\sim} \log\frac{p^2}{m^2}.$$

The two-loop vertex is proportional to $\log^2(p^2/m^2)$. A diagram of this
structure with $n$ loops will have the form

$$\lambda^{n+1}\log^n\frac{p^2}{m^2}.$$

This asymptotic behavior is actually a generic property of multiloop diagrams,
which we will explore in more detail in Chapter 12.

Now consider the more difficult diagram, from Group II:

$$(\text{two-loop diagram of Group II}) = (-i\lambda)^3\int\frac{d^dk}{(2\pi)^d}\,\frac{i}{k^2 - m^2}\,\frac{i}{(k+p)^2 - m^2}\,iV\big((k+p_3)^2\big). \tag{10.54}$$

In evaluating this diagram, we will combine denominators in the manner that
makes it most straightforward to extract the divergent terms, at the price
of complicating the evaluation of the finite parts. Another approach to the
calculation of this diagram is discussed in Problem 10.4.

To begin the evaluation of (10.54), combine the pair of denominators
shown explicitly, and substitute expression (10.52) for $V(p^2)$. This gives the
expression

$$-\frac{\lambda^3}{2}\,\frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\int_0^1 dx\int_0^1 dy\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{[k^2 + 2yk\cdot p + yp^2 - m^2]^2} \times \frac{1}{[m^2 - x(1-x)(k+p_3)^2]^{2-d/2}}. \tag{10.55}$$

It is possible to combine this pair of denominators by using the identity

$$\frac{1}{A^\alpha B^\beta} = \int_0^1 dw\,\frac{w^{\alpha-1}(1-w)^{\beta-1}}{[wA + (1-w)B]^{\alpha+\beta}}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}. \tag{10.56}$$

This is a special case of the formula quoted in Section 6.3, Eq. (6.42). To prove
it, change variables in the integral:

$$z \equiv \frac{wA}{wA + (1-w)B}, \qquad (1-z) = \frac{(1-w)B}{wA + (1-w)B}, \qquad dz = \frac{AB\,dw}{[wA + (1-w)B]^2},$$

so that

$$\int_0^1 dw\,\frac{w^{\alpha-1}(1-w)^{\beta-1}}{[wA + (1-w)B]^{\alpha+\beta}} = \frac{1}{A^\alpha B^\beta}\int_0^1 dz\,z^{\alpha-1}(1-z)^{\beta-1} = \frac{1}{A^\alpha B^\beta}\,B(\alpha,\beta),$$

where $B(\alpha,\beta)$ is the beta function, Eq. (7.82). The more general identity
(6.42) can be proved by induction.
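Identity (10.56) is also easy to test numerically. The sketch below evaluates its right-hand side with a midpoint rule for sample values of $A$, $B$, $\alpha$, and $\beta$. The function names, the sample values, and the grid size `n` are hypothetical choices, and the exponents are taken to be at least 1 so that the integrand has no endpoint singularity.

```python
import math


def feynman_rhs(a, b, alpha, beta, n=20000):
    """Midpoint estimate of the right-hand side of Eq. (10.56)."""
    total = 0.0
    for i in range(n):
        w = (i + 0.5) / n
        total += (w ** (alpha - 1.0) * (1.0 - w) ** (beta - 1.0)
                  / (w * a + (1.0 - w) * b) ** (alpha + beta))
    prefactor = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return prefactor * total / n


def feynman_lhs(a, b, alpha, beta):
    """Left-hand side of Eq. (10.56): 1 / (A^alpha B^beta)."""
    return 1.0 / (a ** alpha * b ** beta)


# Hypothetical sample values with a non-integer exponent.
print(feynman_lhs(2.0, 5.0, 1.5, 2.0))
print(feynman_rhs(2.0, 5.0, 1.5, 2.0))
```

In the text the identity is applied with $\alpha = 2 - d/2$ and $\beta = 2$, where the factor $w^{\alpha-1} = w^{1-d/2}$ becomes singular at $w = 0$ as $d \to 4$. A plain midpoint rule is not adequate in that regime.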

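Applying identity (10.56) to (10.55), we obtain

$$-\frac{\lambda^3}{2}\,\frac{\Gamma(4-\frac{d}{2})}{(4\pi)^{d/2}}\int_0^1 dx\int_0^1 dy\int_0^1 dw\int\frac{d^dk}{(2\pi)^d}\,\frac{w^{1-d/2}(1-w)}{\Big(w[m^2 - x(1-x)(k+p_3)^2] + (1-w)[m^2 - k^2 - 2yk\cdot p - yp^2]\Big)^{4-d/2}}. \tag{10.57}$$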

Completing the square in the denominator yields a polynomial of the form 


$$ -\big[(1-w) + wx(1-x)\big]\ell^2 - P^2 + m^2, \tag{10.58} $$

where $\ell$ is a shifted momentum variable and $P^2$ is a rather complicated function
of $p$, $p_3$, and the various Feynman parameters. It will only be important
for this analysis that, as $w \to 0$,

$$ P^2(w) = y(1-y)p^2 + \mathcal{O}(w), \tag{10.59} $$

and this can be seen easily from (10.57). Changing variables to $\ell$, Wick-rotating,
and performing the integral, we eventually obtain


$$ -\frac{i\lambda^3}{2(4\pi)^d} \int_0^1 dx \int_0^1 dy \int_0^1 dw\, \frac{w^{1-\frac d2}(1-w)}{[1 - w + wx(1-x)]^{d/2}}\, \frac{\Gamma(4-d)}{(m^2 - P^2)^{4-d}}. \tag{10.60} $$


This expression has one obvious pole as $d \to 4$, coming from the gamma
function. However, it also has a less obvious pole, coming from the zero end
of the $w$ integral. Let us write (10.60) as

$$ \int_0^1 dw\, w^{1-\frac d2} f(w), $$

where $f(w)$ incorporates all the factors not displayed explicitly. To isolate the
pole at $w = 0$, we can add and subtract $f(0)$:

$$ \int_0^1 dw\, w^{1-\frac d2} f(w) = \int_0^1 dw\, w^{1-\frac d2} f(0) + \int_0^1 dw\, w^{1-\frac d2}\,[f(w) - f(0)]. \tag{10.61} $$
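
The add-and-subtract step in (10.61) can be illustrated numerically. With $\epsilon = 4 - d$ the weight is $w^{-1+\epsilon/2}$, so the first term integrates exactly to $(2/\epsilon) f(0)$, while the subtracted term stays finite as $\epsilon \to 0$. The sketch below uses a stand-in test function $f(w) = 1/(1+w)$, which is an assumption made purely for illustration (the actual $f$ in the text is far more complicated); for this choice the subtracted piece tends to $-\log 2$.

```python
import math

def subtracted_piece(f, eps, n=20000):
    """Midpoint estimate of int_0^1 dw w^(-1+eps/2) [f(w) - f(0)]."""
    h = 1.0 / n
    f0 = f(0.0)
    total = 0.0
    for j in range(n):
        w = (j + 0.5) * h
        total += w ** (-1.0 + eps / 2.0) * (f(w) - f0)
    return total * h

def pole_piece(f, eps):
    """Exact value of int_0^1 dw w^(-1+eps/2) f(0) = (2/eps) f(0)."""
    return 2.0 / eps * f(0.0)

f = lambda w: 1.0 / (1.0 + w)  # hypothetical smooth test function
for eps in (1e-1, 1e-2, 1e-3):
    # The pole piece grows like 2/eps; the subtracted piece stays bounded.
    print(eps, pole_piece(f, eps), subtracted_piece(f, eps))

finite_limit = subtracted_piece(f, 1e-6)
print(finite_limit, -math.log(2.0))
```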
The second piece is 

$$ -\frac{i\lambda^3\,\Gamma(4-d)}{2(4\pi)^d} \int_0^1 dx \int_0^1 dy \int_0^1 dw\, w^{1-\frac d2} \left[\frac{(1-w)}{[1 - w + wx(1-x)]^{d/2}\,[m^2 - P^2(w)]^{4-d}} - \frac{1}{[m^2 - P^2(0)]^{4-d}}\right]. $$

This term has only a simple pole as $d \to 4$; the residue of the pole is a
momentum-independent constant, obtained by setting $d = 4$ everywhere except
in $\Gamma(4-d)$. We can therefore absorb this divergence into the $\mathcal{O}(\lambda^3)$ vertex
counterterm. (The finite part of this expression has a very complicated dependence
on momentum, but we do not need to work this out to complete
our argument.)

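We are left with only the first term of (10.61). This expression contains
only $P^2(0)$, which is given by (10.59). The $w$ integral in this term is straightforward,
and the $x$ integral is trivial. With $\epsilon = 4 - d$, our remaining expression
is

$$
\begin{aligned}
&-\frac{i\lambda^3}{2(4\pi)^d}\left(\frac{2}{\epsilon}\right)\int_0^1 dy\, \frac{\Gamma(\epsilon)}{[m^2 - y(1-y)p^2]^{\epsilon}} \\
&\qquad = -\frac{i\lambda^3}{2(4\pi)^d}\int_0^1 dy \left(\frac{2}{\epsilon^2} - \frac{2}{\epsilon}\gamma - \frac{2}{\epsilon}\log[m^2 - y(1-y)p^2]\right),
\end{aligned} \tag{10.62}
$$

where we have kept only the divergent terms in the second line. The logarithm,
multiplied by the pole $2/\epsilon$, is the nonlocal divergence that we worried about
in Section 10.4.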

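Fortunately, we must still add to this the "$t + u$" counterterm diagram
of Group II. The computation of that diagram is by now a straightforward
process:

$$
\begin{aligned}
&= (-i\lambda)^3 \cdot \big(-2iV(0)\big) \cdot iV(p^2) \\
&= \frac{i\lambda^3}{2(4\pi)^d}\int_0^1 dy\, \frac{\Gamma(2-\frac d2)}{[m^2]^{2-d/2}}\, \frac{\Gamma(2-\frac d2)}{[m^2 - y(1-y)p^2]^{2-d/2}} \\
&= \frac{i\lambda^3}{2(4\pi)^d}\int_0^1 dy \left(\frac{2}{\epsilon} - \gamma - \log m^2\right) \times \left(\frac{2}{\epsilon} - \gamma - \log[m^2 - y(1-y)p^2]\right).
\end{aligned} \tag{10.63}
$$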

(Again we have dropped finite terms from the last line.) This expression also
contains a nonlocal divergence, given by the first pole times the second logarithm.
It exactly cancels the nonlocal divergence in (10.62). The remaining
terms are all either finite, or divergent but independent of momentum. This
completes the proof that the two-loop contribution to the four-point function
is finite.
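
The cancellation of the nonlocal divergence can be checked numerically at the level of the integrands of (10.62) and (10.63). The sketch below strips off the common prefactor $i\lambda^3/2(4\pi)^d$ and evaluates the $\epsilon$-dependent factors at two sample values of $\Delta = m^2 - y(1-y)p^2$. The specific numbers, the choice $m^2 = 1$, and the function names are assumptions for illustration. The difference between the two $\Delta$ values isolates the momentum-dependent part: it blows up like $1/\epsilon$ for the loop diagram alone, but approaches a finite double-logarithm once the counterterm diagram is added.

```python
import math

def loop_term(delta, eps):
    """eps-dependent factor of (10.62): -(2/eps) Gamma(eps) delta^(-eps)."""
    return -(2.0 / eps) * math.gamma(eps) * delta ** (-eps)

def counterterm_term(delta, eps, m2=1.0):
    """eps-dependent factor of (10.63), using Gamma(2 - d/2) = Gamma(eps/2)."""
    g = math.gamma(eps / 2.0)
    return (g * m2 ** (-eps / 2.0)) * (g * delta ** (-eps / 2.0))

d1, d2 = 2.0, 5.0  # two sample values of m^2 - y(1-y)p^2, in units with m^2 = 1
eps = 1e-3

# Momentum-dependent part of the loop diagram alone: diverges like 1/eps.
loop_only = loop_term(d1, eps) - loop_term(d2, eps)

# Same quantity after adding the counterterm diagram: finite as eps -> 0.
total = ((loop_term(d1, eps) + counterterm_term(d1, eps))
         - (loop_term(d2, eps) + counterterm_term(d2, eps)))

# For m^2 = 1 the finite remainder works out to a pure double logarithm.
expected = 0.5 * (math.log(d2) ** 2 - math.log(d1) ** 2)
print(loop_only, total, expected)
```

The surviving finite piece is quadratic in logarithms, in line with the double-logarithmic behavior of the two-loop amplitude.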

The two features of the Group I diagrams appear here in Group II as
well. The divergent pieces of (10.62) and (10.63) contain double poles that do
not cancel, so we again find that the second-order vertex counterterm must
contain a double pole. The finite pieces of (10.62) and (10.63) contain double
logarithms, so we again find that the two-loop amplitude behaves as $\lambda^3\log^2 p^2$
as $p \to \infty$.

Problems 

10.1 One-loop structure of QED. In Section 10.1 we argued from general principles
that the photon one-point and three-point functions vanish, while the four-point
function is finite.

(a) Verify directly that the one-loop diagram contributing to the one-point function
vanishes. There are two Feynman diagrams contributing to the three-point
function at one-loop order. Show that these cancel. Show that the diagrams
contributing to any $n$-point photon amplitude, for $n$ odd, cancel in pairs.

(b) The photon four-point amplitude is a sum of six diagrams. Show explicitly that 
the potential logarithmic divergences of these diagrams cancel. 

10.2 Renormalization of Yukawa theory. Consider the pseudoscalar Yukawa Lagrangian,

$$ \mathcal{L} = \frac12(\partial_\mu\phi)^2 - \frac12 m^2\phi^2 + \bar\psi(i\gamma^\mu\partial_\mu - M)\psi - ig\,\bar\psi\gamma^5\psi\,\phi, $$

where $\phi$ is a real scalar field and $\psi$ is a Dirac fermion. Notice that this Lagrangian is
invariant under the parity transformation $\psi(t,\mathbf{x}) \to \gamma^0\psi(t,-\mathbf{x})$, $\phi(t,\mathbf{x}) \to -\phi(t,-\mathbf{x})$,
in which the field $\phi$ carries odd parity.

(a) Determine the superficially divergent amplitudes and work out the Feynman
rules for renormalized perturbation theory for this Lagrangian. Include all necessary
counterterm vertices. Show that the theory contains a superficially divergent
$4\phi$ amplitude. This means that the theory cannot be renormalized unless
one includes a scalar self-interaction,



and a counterterm of the same form. It is of course possible to set the renormalized
value of this coupling to zero, but that is not a natural choice, since the
counterterm will still be nonzero. Are any further interactions required?

(b) Compute the divergent part (the pole as $d \to 4$) of each counterterm, to the
one-loop order of perturbation theory, implementing a sufficient set of renormalization
conditions. You need not worry about finite parts of the counterterms. Since
the divergent parts must have a fixed dependence on the external momenta, you
can simplify this calculation by choosing the momenta in the simplest possible
way.

10.3 Field-strength renormalization in $\phi^4$ theory. The two-loop contribution to
the propagator in $\phi^4$ theory involves the three diagrams shown in (10.31). Compute the
first of these diagrams in the limit of zero mass for the scalar field, using dimensional
regularization. Show that, near $d = 4$, this diagram takes the form:

$$ = -ip^2 \cdot \frac{\lambda^2}{12(4\pi)^4}\left[-\frac{1}{\epsilon} + \log p^2 + \cdots\right], $$

with $\epsilon = 4 - d$. The coefficient in this equation involves a Feynman parameter integral
that can be evaluated by setting $d = 4$. Verify that the second diagram in (10.31)
vanishes near $d = 4$. Thus the first diagram should contain a pole only at $\epsilon = 0$, which
can be canceled by a field-strength renormalization counterterm.

10.4 Asymptotic behavior of diagrams in $\phi^4$ theory. Compute the leading
terms in the S-matrix element for boson-boson scattering in $\phi^4$ theory in the limit
$s \to \infty$, $t$ fixed. Ignore all masses on internal lines, and keep external masses nonzero
only as infrared regulators where these are needed. Show that

$$ i\mathcal{M}(s,t) \sim -i\lambda - i\frac{\lambda^2}{(4\pi)^2}\log s - i\frac{5\lambda^3}{2(4\pi)^4}\log^2 s + \cdots. $$

Notice that ignoring the internal masses allows some pleasing simplifications of the
Feynman parameter integrals.
Now that we have determined the general structure of the ultraviolet divergences
of quantum field theories, it would seem natural to continue investigating
the implications of these divergences in Feynman diagram calculations.
However, we will now put this issue aside until Chapter 12 and set off in what
may seem an unrelated direction. In Chapter 8 and in Section 9.3, we noted the
formal relation between quantum field theory and statistical mechanics. The
closest formal analogue of a scalar field theory was seen to be the continuum
description of a ferromagnet or some other system that allows a second-order
phase transition. This analogy raises the possibility that in quantum field theory
as well it may be possible for the field to take on a nonzero global value.
As in a magnet, this global field might have a directional character, and thus
violate a symmetry of the Lagrangian. In such a case, we say that the field
theory has hidden or spontaneously broken symmetry. We devote this chapter
to an analysis of this mechanism of symmetry violation.

Spontaneously broken symmetry is a central concept in the study of quantum
field theory, for two reasons. First, it plays a major role in the applications
of quantum field theory to Nature. In this book, we will see two very different
examples of such applications: Chapter 13 will apply the theory of hidden
symmetry to statistical mechanics, specifically to the behavior of thermodynamic
variables near second-order phase transitions. Later, in Chapter 20, we
will see that hidden symmetry is an essential ingredient in the theory of the
weak interactions. Spontaneous symmetry breaking also finds applications in
the theory of the strong interactions, and in the search for unified models of
fundamental physics.

But spontaneous symmetry breaking is also interesting from a theoretical
point of view. Quantum field theories with spontaneously broken symmetry
contain ultraviolet divergences. Thus, it is natural to ask whether these divergences
are constrained by the underlying symmetry of the theory. The answer
to this question, first presented by Benjamin Lee,* will give us further insights
into the nature of ultraviolet divergences and the meaning of renormalization.


*A beautiful summary of Lee's analysis is given in his lecture note volume: B. Lee,
Chiral Dynamics (Gordon and Breach, New York, 1972).

11.1 Spontaneous Symmetry Breaking


We begin with an analysis of spontaneous symmetry breaking in classical field 
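theory. Consider first the familiar $\phi^4$ theory Lagrangian,

$$ \mathcal{L} = \frac12(\partial_\mu\phi)^2 - \frac12 m^2\phi^2 - \frac{\lambda}{4!}\phi^4, $$

but with $m^2$ replaced by a negative parameter, $-\mu^2$:

$$ \mathcal{L} = \frac12(\partial_\mu\phi)^2 + \frac12\mu^2\phi^2 - \frac{\lambda}{4!}\phi^4. \tag{11.1} $$

This Lagrangian has a discrete symmetry: It is invariant under the operation
$\phi \to -\phi$. The corresponding Hamiltonian is

$$ H = \int d^3x \left[\frac12\pi^2 + \frac12(\nabla\phi)^2 - \frac12\mu^2\phi^2 + \frac{\lambda}{4!}\phi^4\right]. $$

The minimum-energy classical configuration is a uniform field $\phi(x) = \phi_0$, with
$\phi_0$ chosen to minimize the potential

$$ V(\phi) = -\frac12\mu^2\phi^2 + \frac{\lambda}{4!}\phi^4 $$

(see Fig. 11.1). This potential has two minima, given by

$$ \phi_0 = \pm v = \pm\sqrt{\frac{6}{\lambda}}\,\mu. \tag{11.2} $$

The constant $v$ is called the vacuum expectation value of $\phi$.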

To interpret this theory, suppose that the system is near one of the minima 
(say the positive one). Then it is convenient to define 


$$ \phi(x) = v + \sigma(x), \tag{11.3} $$

and rewrite $\mathcal{L}$ in terms of $\sigma(x)$. Plugging (11.3) into (11.1), we find that the
term linear in $\sigma$ vanishes (as it must, since the minimum of the potential is
at $\sigma = 0$). Dropping the constant term as well, we obtain the Lagrangian

$$ \mathcal{L} = \frac12(\partial_\mu\sigma)^2 - \frac12(2\mu^2)\sigma^2 - \sqrt{\frac{\lambda}{6}}\,\mu\,\sigma^3 - \frac{\lambda}{4!}\sigma^4. \tag{11.4} $$

This Lagrangian describes a simple scalar field of mass $\sqrt{2}\mu$, with $\sigma^3$ and $\sigma^4$
interactions. The symmetry $\phi \to -\phi$ is no longer apparent; its only manifestation
is in the relations among the three coefficients in (11.4), which depend
in a special way on only two parameters. This is the simplest example of a
spontaneously broken symmetry.
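
The shift from (11.1) to (11.4) is easy to verify numerically: the potential evaluated at $v + \sigma$, minus its constant value at $v$, should equal the quadratic, cubic, and quartic terms read off from (11.4). The following sketch does this for arbitrary test values of $\lambda$ and $\mu$ (the numbers and function names are assumptions chosen for illustration).

```python
import math

lam, mu = 0.7, 1.3                 # arbitrary positive test values
v = math.sqrt(6.0 / lam) * mu      # vacuum expectation value, Eq. (11.2)

def V(phi):
    """Potential of Eq. (11.1): V = -mu^2 phi^2 / 2 + lam phi^4 / 4!."""
    return -0.5 * mu**2 * phi**2 + lam / 24.0 * phi**4

def V_shifted(sigma):
    """Potential terms read off from Eq. (11.4), without the constant V(v)."""
    return (0.5 * (2 * mu**2) * sigma**2
            + math.sqrt(lam / 6.0) * mu * sigma**3
            + lam / 24.0 * sigma**4)

# No linear term appears in V_shifted, so agreement also confirms V'(v) = 0.
max_diff = max(abs(V(v + s) - V(v) - V_shifted(s))
               for s in (-1.0, -0.3, 0.2, 0.9, 2.0))
print(max_diff)
```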
Figure 11.1. Potential for spontaneous symmetry breaking in the discrete
case.


The Linear Sigma Model 

A more interesting theory arises when the broken symmetry is continuous, 
rather than discrete. The most important example is a generalization of the 
preceding theory called the linear sigma model, which we considered briefly 
in Problem 4.3. We will study this model in detail throughout this chapter. 

The Lagrangian of the linear sigma model involves a set of $N$ real scalar
fields $\phi^i(x)$:

$$ \mathcal{L} = \frac12(\partial_\mu\phi^i)^2 + \frac12\mu^2(\phi^i)^2 - \frac{\lambda}{4}\big[(\phi^i)^2\big]^2, \tag{11.5} $$

with an implicit sum over $i$ in each factor $(\phi^i)^2$. Note that we have rescaled the
coupling $\lambda$ from the $\phi^4$ theory Lagrangian to remove the awkward factors of 6
in the analysis above. The Lagrangian (11.5) is invariant under the symmetry

$$ \phi^i \to R^{ij}\phi^j \tag{11.6} $$

for any $N \times N$ orthogonal matrix $R$. The group of transformations (11.6)
is just the rotation group in $N$ dimensions, also called the $N$-dimensional
orthogonal group or simply $O(N)$.

Again the lowest-energy classical configuration is a constant field $\phi_0^i$,
whose value is chosen to minimize the potential

$$ V(\phi^i) = -\frac12\mu^2(\phi^i)^2 + \frac{\lambda}{4}\big[(\phi^i)^2\big]^2 $$

(see Fig. 11.2). This potential is minimized for any $\phi_0^i$ that satisfies

$$ (\phi_0^i)^2 = \frac{\mu^2}{\lambda}. $$

This condition determines only the length of the vector $\phi_0^i$; its direction is
arbitrary. It is conventional to choose coordinates so that $\phi_0^i$ points in the
$N$th direction:

$$ \phi_0^i = (0, 0, \ldots, 0, v), \qquad \text{where } v = \frac{\mu}{\sqrt{\lambda}}. \tag{11.7} $$

Figure 11.2. Potential for spontaneous breaking of a continuous $O(N)$ symmetry,
drawn for the case $N = 2$. Oscillations along the trough in the potential
correspond to the massless $\pi$ fields.

We can now define a set of shifted fields by writing 

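$$ \phi^i(x) = \big(\pi^k(x),\, v + \sigma(x)\big), \qquad k = 1, \ldots, N-1. \tag{11.8} $$

(The notation, as in Problem 4.3, comes from the application of this formalism
to pions in the case $N = 4$.)

It is now straightforward to rewrite the Lagrangian (11.5) in terms of the
$\pi$ and $\sigma$ fields. The result is

$$
\begin{aligned}
\mathcal{L} ={}& \frac12(\partial_\mu\pi^k)^2 + \frac12(\partial_\mu\sigma)^2 - \frac12(2\mu^2)\sigma^2 \\
& - \sqrt{\lambda}\,\mu\,\sigma^3 - \sqrt{\lambda}\,\mu\,(\pi^k)^2\sigma - \frac{\lambda}{4}\sigma^4 - \frac{\lambda}{2}(\pi^k)^2\sigma^2 - \frac{\lambda}{4}\big[(\pi^k)^2\big]^2.
\end{aligned} \tag{11.9}
$$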

We obtain a massive $\sigma$ field just as in (11.4), and also a set of $N-1$ massless
$\pi$ fields. The original $O(N)$ symmetry is hidden, leaving only the subgroup
$O(N-1)$, which rotates the $\pi$ fields among themselves. Referring to Fig. 11.2,
we note that the massive $\sigma$ field describes oscillations of $\phi^i$ in the radial
direction, in which the potential has a nonvanishing second derivative. The
massless $\pi$ fields describe oscillations of $\phi^i$ in the tangential directions, along
the trough of the potential. The trough is an $(N-1)$-dimensional surface, and
all $N-1$ directions are equivalent, reflecting the unbroken $O(N-1)$ symmetry.
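
The mass spectrum just described can be read off numerically from the second derivatives of the potential at the vacuum. The sketch below builds a finite-difference Hessian of $V(\phi^i)$ at $\phi_0^i = (0, \ldots, 0, v)$ and checks that it has $N-1$ vanishing diagonal entries and one entry equal to $2\mu^2$. The parameter values, step size, and helper names are assumptions for illustration only.

```python
import math

N, lam, mu = 4, 0.5, 1.2           # arbitrary test values
v = mu / math.sqrt(lam)            # Eq. (11.7)

def V(phi):
    """Linear sigma model potential: -mu^2 phi.phi / 2 + lam (phi.phi)^2 / 4."""
    s = sum(x * x for x in phi)
    return -0.5 * mu**2 * s + 0.25 * lam * s * s

def hessian(f, x0, h=1e-4):
    """Central finite-difference Hessian of f at the point x0."""
    n = len(x0)
    H = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            def shifted(da, db):
                x = list(x0)
                x[a] += da
                x[b] += db
                return f(x)
            H[a][b] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return H

phi0 = [0.0] * (N - 1) + [v]       # vacuum pointing in the N-th direction
H = hessian(V, phi0)
masses_sq = [H[a][a] for a in range(N)]
off_diag = max(abs(H[a][b]) for a in range(N) for b in range(N) if a != b)
print(masses_sq, off_diag)
```

The first $N-1$ entries of `masses_sq` correspond to the $\pi$ directions and the last to the $\sigma$ direction.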
Goldstone's Theorem

The appearance of massless particles when a continuous symmetry is spontaneously
broken is a general result, known as Goldstone's theorem. To state
the theorem precisely, we must count the number of linearly independent continuous
symmetry transformations. In the linear sigma model, there are no
continuous symmetries for $N = 1$, while for $N = 2$ there is a single direction of
rotation. A rotation in $N$ dimensions can be in any one of $N(N-1)/2$ planes,
so the $O(N)$-symmetric theory has $N(N-1)/2$ continuous symmetries. After
spontaneous symmetry breaking there are $(N-1)(N-2)/2$ remaining symmetries,
corresponding to rotations of the $(N-1)$ $\pi$ fields. The number of broken
symmetries is the difference, $N-1$.
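
The counting in this paragraph is a one-line computation; the sketch below tabulates it for small $N$ (the function names are illustrative assumptions).

```python
def num_rotation_planes(n):
    """Number of independent rotations (generators) of O(n): n(n-1)/2."""
    return n * (n - 1) // 2

def num_broken(n):
    """Generators of O(n) that are not in the unbroken O(n-1) subgroup."""
    return num_rotation_planes(n) - num_rotation_planes(n - 1)

for n in range(1, 7):
    # Columns: N, total symmetries, broken symmetries (expected N - 1).
    print(n, num_rotation_planes(n), num_broken(n))
```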

Goldstone's theorem states that for every spontaneously broken continuous
symmetry, the theory must contain a massless particle.† We have just seen
that this theorem holds in the linear sigma model, at least at the classical level. 
The massless fields that arise through spontaneous symmetry breaking are 
called Goldstone bosons. Many light bosons seen in physics, such as the pions, 
may be interpreted (at least approximately) as Goldstone bosons. We conclude 
this section with a general proof of Goldstone’s theorem for classical scalar 
field theories. The rest of this chapter is devoted to the quantum-mechanical 
analysis of theories with hidden symmetry. By the end of the chapter we will 
see that Goldstone bosons cannot acquire mass from any order of quantum 
corrections. 

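Consider, then, a theory involving several fields $\phi^a(x)$, with a Lagrangian
of the form

$$ \mathcal{L} = (\text{terms with derivatives}) - V(\phi). \tag{11.10} $$

Let $\phi_0^a$ be a constant field that minimizes $V$, so that

$$ \frac{\partial}{\partial\phi^a}V\bigg|_{\phi^a(x) = \phi_0^a} = 0. $$

Expanding $V$ about this minimum, we find

$$ V(\phi) = V(\phi_0) + \frac12(\phi - \phi_0)^a(\phi - \phi_0)^b\left(\frac{\partial^2}{\partial\phi^a\,\partial\phi^b}V\right)_{\phi_0} + \cdots. $$

The coefficient of the quadratic term,

$$ \left(\frac{\partial^2}{\partial\phi^a\,\partial\phi^b}V\right)_{\phi_0} = m^2_{ab}, \tag{11.11} $$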


is a symmetric matrix whose eigenvalues give the masses of the fields. These
eigenvalues cannot be negative, since $\phi_0$ is a minimum. To prove Goldstone's
theorem, we must show that every continuous symmetry of the Lagrangian
(11.10) that is not a symmetry of $\phi_0$ gives rise to a zero eigenvalue
of this mass matrix.

†J. Goldstone, Nuovo Cim. 19, 154 (1961). An instructive four-page paper by
J. Goldstone, A. Salam, and S. Weinberg, Phys. Rev. 127, 965 (1962), gives three
different proofs of the theorem.

A general continuous symmetry transformation has the form 

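$$ \phi^a \to \phi^a + \alpha\Delta^a(\phi), \tag{11.12} $$

where $\alpha$ is an infinitesimal parameter and $\Delta^a$ is some function of all the $\phi$'s.
Specialize to constant fields; then the derivative terms in $\mathcal{L}$ vanish and the
potential alone must be invariant under (11.12). This condition can be written

$$ V(\phi^a) = V\big(\phi^a + \alpha\Delta^a(\phi)\big) \qquad \text{or} \qquad \Delta^a(\phi)\,\frac{\partial}{\partial\phi^a}V(\phi) = 0. $$

Now differentiate with respect to $\phi^b$, and set $\phi = \phi_0$:

$$ 0 = \left(\frac{\partial\Delta^a}{\partial\phi^b}\right)_{\phi_0}\left(\frac{\partial V}{\partial\phi^a}\right)_{\phi_0} + \Delta^a(\phi_0)\left(\frac{\partial^2}{\partial\phi^a\,\partial\phi^b}V\right)_{\phi_0}. \tag{11.13} $$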

The first term vanishes since $\phi_0$ is a minimum of $V$, so the second term must
also vanish. If the transformation leaves $\phi_0$ unchanged (i.e., if the symmetry is
respected by the ground state), then $\Delta^a(\phi_0) = 0$ and this relation is trivial. A
spontaneously broken symmetry is precisely one for which $\Delta^a(\phi_0) \neq 0$; in this
case $\Delta^a(\phi_0)$ is our desired vector with eigenvalue zero, so Goldstone's theorem
is proved.
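
The statement that $\Delta^a(\phi_0)$ is a null vector of the mass matrix can be checked directly in the linear sigma model, for a vacuum pointing in a generic direction rather than along a coordinate axis. In the sketch below the mass matrix is the analytic second derivative of the potential of (11.5), and the $\Delta^a$ are generated by antisymmetric matrices acting on $\phi_0$. The parameter values, the chosen direction, and the helper names are assumptions for illustration.

```python
import math

N, lam, mu = 3, 0.8, 1.1           # arbitrary test values
v = mu / math.sqrt(lam)
direction = [1.0, 2.0, 2.0]        # arbitrary direction with norm 3
phi0 = [v * x / 3.0 for x in direction]

def mass_matrix(phi):
    """Second derivatives of V = -mu^2 phi.phi/2 + lam (phi.phi)^2/4."""
    s = sum(x * x for x in phi)
    return [[(-mu**2 + lam * s) * (a == b) + 2 * lam * phi[a] * phi[b]
             for b in range(N)] for a in range(N)]

def rotation_generator(i, j):
    """Antisymmetric matrix generating rotations in the (i, j) plane."""
    T = [[0.0] * N for _ in range(N)]
    T[i][j], T[j][i] = 1.0, -1.0
    return T

def matvec(M, x):
    return [sum(M[a][b] * x[b] for b in range(N)) for a in range(N)]

M = mass_matrix(phi0)
residuals = []
for i in range(N):
    for j in range(i + 1, N):
        Delta = matvec(rotation_generator(i, j), phi0)  # Delta^a(phi0)
        residuals.append(max(abs(x) for x in matvec(M, Delta)))

# The radial direction is the one massive mode, with eigenvalue 2 mu^2.
radial = matvec(M, phi0)
print(residuals, [r / p for r, p in zip(radial, phi0)])
```

For $N = 3$ the three vectors $\Delta^a(\phi_0)$ span only the two-dimensional tangent space of the vacuum manifold, consistent with $N - 1 = 2$ broken symmetries.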


11.2 Renormalization and Symmetry: 

An Explicit Example 

Now let us investigate the quantum mechanics of a theory with spontaneously
broken symmetry. Again we will use as our example the linear sigma model.
The Lagrangian of this theory, written in terms of shifted fields, is given in
Eq. (11.9). From this expression, we can read off the Feynman rules; these are
shown in Fig. 11.3.

Figure 11.3. Feynman rules for the linear sigma model.

Using these Feynman rules, we can compute tree-level amplitudes without
difficulty. Diagrams with loops, however, will often diverge. For the amplitude
with $N_e$ external legs, the superficial degree of divergence is

$$ D = 4 - N_e, $$

just as in the discussion of $\phi^4$ theory in Section 10.2. (Diagrams containing
a three-point vertex will be less divergent than this expression indicates, because
this vertex has a coefficient with dimensions of mass.) However, the
symmetry constraints on the amplitudes are much weaker than in that earlier
analysis. The linear sigma model has eight different superficially divergent amplitudes
(see Fig. 11.4); several of these have $D > 0$ and therefore can contain
more than one infinite constant. Yet the number of bare parameters available
to absorb these infinities is much smaller. If we follow the procedure of Section 10.2
to rewrite the original Lagrangian in terms of physical parameters
and counterterms, we find only three counterterms:

$$
\begin{aligned}
\mathcal{L} ={}& \frac12(\partial_\mu\phi^i)^2 + \frac12\mu^2(\phi^i)^2 - \frac{\lambda}{4}\big[(\phi^i)^2\big]^2 \\
& + \frac12\delta_Z(\partial_\mu\phi^i)^2 - \frac12\delta_\mu(\phi^i)^2 - \frac{\delta_\lambda}{4}\big[(\phi^i)^2\big]^2.
\end{aligned} \tag{11.14}
$$

Written in terms of $\pi$ and $\sigma$ fields, the second line takes the form

$$
\begin{aligned}
& \frac{\delta_Z}{2}(\partial_\mu\pi^k)^2 - \frac12(\delta_\mu + \delta_\lambda v^2)(\pi^k)^2 + \frac{\delta_Z}{2}(\partial_\mu\sigma)^2 - \frac12(\delta_\mu + 3\delta_\lambda v^2)\sigma^2 \\
& - (\delta_\mu v + \delta_\lambda v^3)\sigma - \delta_\lambda v\,\sigma(\pi^k)^2 - \delta_\lambda v\,\sigma^3 \\
& - \frac{\delta_\lambda}{4}\big[(\pi^k)^2\big]^2 - \frac{\delta_\lambda}{2}\sigma^2(\pi^k)^2 - \frac{\delta_\lambda}{4}\sigma^4.
\end{aligned} \tag{11.15}
$$

The Feynman rules associated with these counterterms are shown in Fig. 11.5.
There are now plenty of counterterms, but they still depend on only three
renormalization parameters: $\delta_Z$, $\delta_\mu$, and $\delta_\lambda$. It would be a miracle if these
three parameters were able to absorb all the infinities arising in the divergent
amplitudes shown in Fig. 11.4.
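
That the many counterterm vertices really are fixed by the same few parameters can be confirmed by expanding the non-derivative counterterms $-\frac12\delta_\mu(\phi^i)^2 - \frac{\delta_\lambda}{4}[(\phi^i)^2]^2$ around $\phi^i = (\pi^k, v + \sigma)$ and comparing with the non-derivative terms of the shifted form. The sketch below does this comparison at random field values; the numerical values of $v$, $\delta_\mu$, $\delta_\lambda$ and the function names are assumptions for illustration, and the field-independent constant is subtracted by hand.

```python
import random

def counterterm_potential(pi, sigma, v, d_mu, d_lam):
    """Non-derivative counterterms: -d_mu phi^2 / 2 - d_lam (phi^2)^2 / 4."""
    phi_sq = sum(p * p for p in pi) + (v + sigma) ** 2
    return -0.5 * d_mu * phi_sq - 0.25 * d_lam * phi_sq ** 2

def expanded_form(pi, sigma, v, d_mu, d_lam):
    """The same terms written out in pi and sigma, constant dropped."""
    pi_sq = sum(p * p for p in pi)
    return (-0.5 * (d_mu + d_lam * v**2) * pi_sq
            - 0.5 * (d_mu + 3 * d_lam * v**2) * sigma**2
            - (d_mu * v + d_lam * v**3) * sigma
            - d_lam * v * sigma * pi_sq
            - d_lam * v * sigma**3
            - 0.25 * d_lam * pi_sq**2
            - 0.5 * d_lam * sigma**2 * pi_sq
            - 0.25 * d_lam * sigma**4)

random.seed(0)
v, d_mu, d_lam = 1.7, 0.3, -0.2    # arbitrary test values
max_diff = 0.0
for _ in range(100):
    pi = [random.uniform(-1, 1) for _ in range(3)]   # N - 1 = 3 pion fields
    sigma = random.uniform(-1, 1)
    lhs = (counterterm_potential(pi, sigma, v, d_mu, d_lam)
           - counterterm_potential([0.0] * 3, 0.0, v, d_mu, d_lam))
    max_diff = max(max_diff, abs(lhs - expanded_form(pi, sigma, v, d_mu, d_lam)))
print(max_diff)
```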

If this miracle did not occur, that is, if the counterterms of (11.15) did
not absorb all the infinities, we could still make this theory renormalizable by
introducing new, symmetry-breaking terms in the Lagrangian. These would
give rise to additional counterterms, which could be adjusted to render all amplitudes
finite. If desired, we could set the physical values of the symmetry-breaking
coupling constants to zero. The bare values of these constants, however,
would still be nonzero, so the Lagrangian itself would no longer be invariant
under the $O(N)$ symmetry. We would have to conclude that the symmetry
is not consistent with quantum mechanics.
Figure 11.4. Divergent amplitudes in the linear sigma model.


Figure 11.5. Feynman rules for counterterm vertices in the linear sigma 
model. 

Fortunately, the miracle does occur. We will see below that the counterterms
of (11.15), even though they contain only three adjustable parameters,
are indeed sufficient to cancel all the infinities that occur in this theory. In
this section we will demonstrate this cancellation explicitly at the one-loop
level. The rest of this chapter is devoted to a more general discussion of these
issues.

Renormalization Conditions 

In the discussion to follow, we will keep track of only the divergent parts of
Feynman diagrams. However, it will be useful to keep in mind a set of renormalization
conditions that could, in principle, be used to determine also the
finite parts of the counterterms. Since the counterterms contain three adjustable
parameters, we need three conditions. We could take these to be the
conditions (10.19) (implemented according to (10.28)), specifying the physical
mass $m$ of the $\sigma$ field, its field strength, and the scattering amplitude
at threshold. However, it is technically easier to replace one of these conditions
with a constraint on the one-point amplitude for $\sigma$ (the sum of tadpole
diagrams):


In QED the tadpole diagrams automatically vanish, as we saw in Eq. (10.5).
In the linear sigma model, however, no symmetry forbids the appearance of a
nonvanishing one-$\sigma$ amplitude. This amplitude produces a vacuum expectation
value of $\sigma$ and so, since $\phi^N = v + \sigma$, shifts the vacuum expectation value
of $\phi$. Such a shift is quite acceptable, as long as it is finite after counterterms
are properly added into the computation of the amplitude. However, it will
simplify the bookkeeping to set up our conventions so that the relation

$$ \langle\phi^N\rangle = \frac{\mu}{\sqrt{\lambda}} \tag{11.16} $$

is satisfied to all orders in perturbation theory. We will define $\lambda$, as in
Eq. (10.19), as the scattering amplitude at threshold. Then Eq. (11.16) defines
the parameter $\mu$, so the mass $m$ of the $\sigma$ field will differ from the result of
the classical equations $m^2 = 2\mu^2 = 2\lambda v^2$ by terms of order $(\lambda\mu^2)$. If indeed we
can remove the divergences from the theory by adjusting three counterterms,
these corrections will be finite and constitute a prediction of the quantum
field theory.

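To summarize, we will use the following renormalization conditions:

$$
\begin{aligned}
&(\text{one-point } \sigma \text{ amplitude}) = 0; \\
&\frac{d}{dp^2}\big(\text{one-particle-irreducible } \sigma\sigma \text{ amplitude}\big)\bigg|_{p^2 = m^2} = 0; \\
&(\text{amputated } 4\sigma \text{ amplitude}) = -6i\lambda \quad \text{at } s = 4m^2,\ t = u = 0.
\end{aligned} \tag{11.17}
$$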

In the last condition, the circle is the amputated four-point amplitude. Note
that the last two conditions depend on the physical mass $m$ of the $\sigma$ particle.
We must now show that these three conditions suffice to make all of the one-loop
amplitudes of the linear sigma model finite.

The Vertex Counterterm

We begin by determining the counterterm $\delta_\lambda$ by computing the $4\sigma$ amplitude.
The tree-level term comes from the $4\sigma$ vertex, and is just such as to satisfy
(11.17). The one-loop contribution to this amplitude is the sum of diagrams:


(11.18) 


According to (11.17), we must adjust $\delta_\lambda$ so that this sum of diagrams vanishes
at threshold. In this calculation, we will only keep track of the ultraviolet divergences.
This greatly simplifies the analysis, because most of the diagrams
in (11.18) are finite. All the diagrams with loops made of three or more propagators
are finite, since they have at least six powers of the loop momentum
in the denominator; for example,

$$ \sim \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2}\,\frac{1}{k^2}\,\frac{1}{k^2}. $$

Alternatively, we can see that this diagram is finite in the following way:
Each three-point vertex carries a factor of $\mu$, which has dimensions of mass.
According to the dimensional analysis argument of Section 10.1, each such
factor lowers the degree of divergence of a diagram by 1. Since the $4\sigma$ amplitude
already has $D = 0$, any diagram containing a three-point vertex must be
finite.

We are left with the first two diagrams of (11.18) and the four diagrams related
to these by crossing. Let us evaluate the first diagram using dimensional
regularization:

$$
\begin{aligned}
&= \frac12\cdot(-6i\lambda)^2\int\frac{d^dk}{(2\pi)^d}\,\frac{i}{k^2 - 2\mu^2}\,\frac{i}{(k+p)^2 - 2\mu^2} \\
&= 18\lambda^2\int_0^1 dx\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{[k^2 - \Delta]^2} \\
&= 18i\lambda^2\int_0^1 dx\,\frac{\Gamma(2-\frac d2)}{(4\pi)^{d/2}}\left(\frac{1}{\Delta}\right)^{2-\frac d2} \\
&= 18i\lambda^2\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2} + (\text{finite terms}).
\end{aligned} \tag{11.19}
$$

Here $\Delta$ is a function of $p$ and $\mu$, whose exact form does not concern us. Since
our objective is only to demonstrate the cancellation of the divergences, we
will neglect finite terms here and throughout the rest of this section. The
second diagram of (11.18) (with $\pi$'s instead of $\sigma$'s for the internal lines) is
identical, except that each vertex factor is changed from $-6i\lambda$ to $-2i\lambda\delta^{ij}$.
(Roman indices $i, j, \ldots$ run from 1 to $N-1$.) We therefore have

$$ = 2i\lambda^2(N-1)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2} + (\text{finite terms}). \tag{11.20} $$


Since the infinite part of each of these diagrams is simply a momentum-independent
constant, the infinite parts of the corresponding $t$- and $u$-channel
diagrams must be identical. Therefore the infinite part of the $4\sigma$ vertex is just
three times the sum of (11.19) and (11.20):

$$ \sim 6i\lambda^2(N+8)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. \tag{11.21} $$

(In this section we use the $\sim$ symbol to indicate equality up to omitted finite
corrections.) Applying the third condition of (11.17), we find that the
counterterm $\delta_\lambda$ is given by

$$ \delta_\lambda \sim \lambda^2(N+8)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. \tag{11.22} $$

Once we have determined the value of $\delta_\lambda$, we have fixed the counterterms
for the two other four-point amplitudes. Are these amplitudes also made finite?
Consider the amplitude with two $\sigma$'s and two $\pi$'s. This receives one-loop
corrections from


(11.23) 


and from several diagrams with three-point vertices which, as argued earlier,
are manifestly finite. Each of the diagrams in (11.23) contains a loop integral
analogous to that in (11.19), whose infinite part is always $-i\Gamma(2-\frac d2)/(4\pi)^2$.
The only differences are in the vertices and symmetry factors. For example,
the infinite part of the first diagram of (11.23) is

$$ \sim \frac12\cdot(-6i\lambda)(-2i\lambda\delta^{ij})\cdot\frac{-i}{(4\pi)^2}\Gamma(2-\tfrac d2) = 6i\lambda^2\delta^{ij}\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

The second diagram is a bit more complicated:

$$ \sim \frac12\cdot(-2i\lambda\delta^{kl})\big(-2i\lambda(\delta^{ij}\delta^{kl} + \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk})\big)\cdot\frac{-i}{(4\pi)^2}\Gamma(2-\tfrac d2) = 2i\lambda^2(N+1)\delta^{ij}\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

In the third diagram there is no symmetry factor:

$$ \sim (-2i\lambda\delta^{ik})(-2i\lambda\delta^{jk})\cdot\frac{-i}{(4\pi)^2}\Gamma(2-\tfrac d2) = 4i\lambda^2\delta^{ij}\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

The fourth diagram of (11.23) gives an identical expression, since it is the
same as the third but with $i$ and $j$ interchanged. The sum of the four diagrams
therefore gives, for the infinite part of the $\sigma\sigma\pi\pi$ vertex,

$$ \sim 2i\lambda^2\delta^{ij}(N+8)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. \tag{11.24} $$

This divergent term is indeed canceled by the $\sigma\sigma\pi\pi$ counterterm, with the
value of $\delta_\lambda$ given in (11.22).

The remaining four-point amplitude has four external $\pi$ fields. The divergent
one-loop diagrams are:

(11.25) 

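These diagrams all have the same familiar form. The first is

$$ \sim \frac12\cdot(-2i\lambda\delta^{ij})(-2i\lambda\delta^{kl})\cdot\frac{-i}{(4\pi)^2}\Gamma(2-\tfrac d2) = 2i\lambda^2\delta^{ij}\delta^{kl}\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

The second diagram is more complicated:

$$
\begin{aligned}
&\sim \frac12\cdot\big(-2i\lambda(\delta^{ij}\delta^{mn} + \delta^{im}\delta^{jn} + \delta^{in}\delta^{jm})\big)\cdot\big(-2i\lambda(\delta^{kl}\delta^{mn} + \delta^{km}\delta^{ln} + \delta^{kn}\delta^{lm})\big)\cdot\frac{-i}{(4\pi)^2}\Gamma(2-\tfrac d2) \\
&= 2i\lambda^2\big((N+3)\delta^{ij}\delta^{kl} + 2\delta^{ik}\delta^{jl} + 2\delta^{il}\delta^{jk}\big)\frac{\Gamma(2-\frac d2)}{(4\pi)^2}.
\end{aligned}
$$

For each of these diagrams there are two corresponding cross-channel diagrams,
which differ only in the ways that the external indices $ijkl$ are paired
together. For instance, the $t$-channel diagrams are identical to the $s$-channel
diagrams, but with $j$ and $k$ interchanged. Adding all six diagrams, we find for
the $4\pi$ vertex

$$ \sim 2i\lambda^2\big(\delta^{ij}\delta^{kl} + \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk}\big)(N+8)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. \tag{11.26} $$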


Again, the value of $\delta_\lambda$ given in (11.22) gives a counterterm of the correct value
and index structure to cancel this divergence.
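
The index algebra behind the $4\pi$ result can be verified by brute force. The sketch below works in units of $2i\lambda^2\,\Gamma(2-\frac d2)/(4\pi)^2$, contracts the internal indices of the pion loop explicitly, adds the $\sigma$ loop, sums the three channels, and compares with $(N+8)(\delta^{ij}\delta^{kl} + \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk})$. Treating the $u$-channel as the $s$-channel with $j$ and $l$ interchanged is an assumption consistent with the crossing rule quoted for the $t$-channel; the function names are illustrative.

```python
from itertools import product

def delta(a, b):
    return 1 if a == b else 0

def t3(i, j, k, l):
    """Symmetric index tensor of the four-pion vertex."""
    return (delta(i, j) * delta(k, l)
            + delta(i, k) * delta(j, l)
            + delta(i, l) * delta(j, k))

def s_channel(i, j, k, l, n_pi):
    """Index structure of the two s-channel loops, in the units stated above."""
    sigma_loop = delta(i, j) * delta(k, l)
    pion_loop = sum(t3(i, j, m, n) * t3(k, l, m, n)
                    for m in range(n_pi) for n in range(n_pi))
    return sigma_loop + pion_loop

def check(N):
    """True if the s + t + u sum equals (N + 8) times the vertex tensor."""
    n_pi = N - 1  # number of pion fields
    for i, j, k, l in product(range(n_pi), repeat=4):
        total = (s_channel(i, j, k, l, n_pi)     # s-channel
                 + s_channel(i, k, j, l, n_pi)   # t-channel: j <-> k
                 + s_channel(i, l, k, j, n_pi))  # u-channel: j <-> l
        if total != (N + 8) * t3(i, j, k, l):
            return False
    return True

print([check(N) for N in (2, 3, 4, 5)])
```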

The value of $\delta_\lambda$ that we have determined also fixes the counterterms for
the three-point amplitudes. Thus we have no further freedom in canceling the
divergences in the three-point amplitudes; we can only cross our fingers and
hope these also come out finite. The $3\sigma$ amplitude is given by


(11.27) 


The diagrams made of three three-point vertices are finite and play no role in
the cancellation of divergences. Of the divergent diagrams in (11.27), the first
has the form

$$ = \frac12(-6i\lambda)(-6i\lambda v)\int\frac{d^dk}{(2\pi)^d}\,\frac{i}{k^2 - 2\mu^2}\,\frac{i}{(k+p)^2 - 2\mu^2} \sim 18i\lambda^2 v\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

This is exactly the same as the corresponding diagram (11.19) for the $4\sigma$
vertex, except for the extra factor of $v$. The same is true of the other five
divergent diagrams; thus,

$$ \sim 6i\lambda^2 v(N+8)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. \tag{11.28} $$

This is precisely canceled by the $3\sigma$ counterterm vertex in Fig. 11.5, with $\delta_\lambda$
given by (11.22).

There is a similar correspondence between the $\sigma\pi\pi$ amplitude and the
$\sigma\sigma\pi\pi$ amplitude. The four divergent diagrams in the $\sigma\pi\pi$ amplitude are identical
to those in (11.23), except that each has an external $\sigma$ leg replaced by a
factor of $v$. Referring to the $\sigma\pi\pi$ counterterm vertex in Fig. 11.5, we see that
the cancellation of divergences will occur here as well.

What is happening? All the divergences we have seen so far are manifestations
of the basic diagram


(11.29) 


with either four external particles or with one leg set to zero momentum and
associated with the vacuum expectation value of $\phi$. Since the $O(N)$ symmetry
is broken, this diagram manifests itself in many different ways. But apparently,
the divergent part of the diagram is unaffected by the symmetry breaking.

Two-Point and One-Point Amplitudes 

To complete our investigation of the one-loop structure of this theory we
must evaluate the two-point and one-point amplitudes. We first determine
the counterterm $\delta_\mu$ by applying the first renormalization condition in (11.17).
At one-loop order, this condition reads


(11.30) 


We will later need to make use of the finite part of the counterterm, so we will
pay attention to the finite terms when we evaluate (11.30). The first diagram
is

$$ = \frac12(-6i\lambda v)\int\frac{d^dk}{(2\pi)^d}\,\frac{i}{k^2 - 2\mu^2} = -3i\lambda v\,\frac{\Gamma(1-\frac d2)}{(4\pi)^{d/2}}\left(\frac{1}{2\mu^2}\right)^{1-\frac d2}. \tag{11.31} $$

The second diagram involves a divergent integral over a massless propagator.
To be sure that we understand how to treat this term, we will add a small
mass $\zeta$ for the $\pi$ field as an infrared regulator. Then the second diagram is

$$ = \frac12(-2i\lambda v\,\delta^{ij})\int\frac{d^dk}{(2\pi)^d}\,\frac{i\,\delta^{ij}}{k^2 - \zeta^2} = -i(N-1)\lambda v\,\frac{\Gamma(1-\frac d2)}{(4\pi)^{d/2}}\left(\frac{1}{\zeta^2}\right)^{1-\frac d2}. \tag{11.32} $$

Notice that, for $d > 2$, the diagram vanishes in the limit as $\zeta \to 0$; however,
it has a pole at $d = 2$. Despite these strange features, we can add (11.32) to
(11.31) and impose the condition that the tadpole diagrams be canceled by
the counterterm from Fig. 11.5. This condition gives

$$ (\delta_\mu + v^2\delta_\lambda) = -\lambda\,\frac{\Gamma(1-\frac d2)}{(4\pi)^{d/2}}\left(\frac{3}{(2\mu^2)^{1-d/2}} + \frac{N-1}{(\zeta^2)^{1-d/2}}\right). \tag{11.33} $$


Now consider the $2\sigma$ amplitude. The one-particle-irreducible amplitude
receives contributions from four one-loop diagrams and a counterterm:


(11.34) 

It is convenient to write the counterterm vertex as

$$ -i(2v^2\delta_\lambda) - i(\delta_\mu + v^2\delta_\lambda) + ip^2\delta_Z. \tag{11.35} $$

In a general renormalization scheme, the $\sigma$ mass will also be shifted by the
tadpole diagrams (and their counterterm):

(11.36) 


However, the first renormalization condition in (11.17) forces these diagrams
to cancel precisely. This is an example of the special simplicity of this renormalization
condition.

The first two diagrams are again manifestations of the generic four-point
diagram (11.29), now with two external legs replaced by the vacuum expectation
value of $\phi$. In analogy with the preceding calculations, we find for the
first diagram

$$ \sim 18i\lambda^2v^2\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}, $$

and for the second diagram

$$ \sim 2i\lambda^2v^2(N-1)\,\frac{\Gamma(2-\frac d2)}{(4\pi)^2}. $$

Using (11.22), we see that these two contributions are canceled by the first
term of (11.35). The third and fourth diagrams of (11.34) contain precisely
the same integrals as the tadpole diagrams of (11.30). Relation (11.33) implies
that they are canceled by the second term in (11.35). Notice that there is no
divergent term proportional to $p^2$ in any of the one-loop diagrams of (11.34).
Thus the renormalization constant $\delta_Z$ is finite at the one-loop level, just as in
ordinary $\phi^4$ theory.

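There remains only one potentially divergent amplitude, the $\pi\pi$ amplitude:

[Diagrammatic equation: three one-loop diagrams plus the counterterm vertex] (11.37)

In analogy with (11.31), the first diagram is

$$ = \frac{1}{2}(-2i\lambda\delta^{ij}) \int \frac{d^dk}{(2\pi)^d}\, \frac{i}{k^2-2\mu^2} = -i\lambda\delta^{ij}\, \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left(\frac{1}{2\mu^2}\right)^{1-d/2}. $$

The second diagram is quite similar. As in (11.32), it is useful to introduce a
small pion mass as an infrared regulator.

$$ = \frac{1}{2} \int \frac{d^dk}{(2\pi)^d}\, \frac{i}{k^2-\zeta^2}\, \delta^{kl}\, (-2i\lambda)\left[\delta^{ij}\delta^{kl} + \delta^{ik}\delta^{jl} + \delta^{il}\delta^{jk}\right] = -i\lambda(N+1)\delta^{ij}\, \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left(\frac{1}{\zeta^2}\right)^{1-d/2}. $$

The third diagram is given by

$$ = (-2i\lambda v\delta^{ik})(-2i\lambda v\delta^{kj}) \int \frac{d^dk}{(2\pi)^d}\, \frac{i}{k^2-\zeta^2}\, \frac{i}{(k+p)^2-2\mu^2} = 4i\lambda^2 v^2 \delta^{ij}\, \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}} \int_0^1 dx \left(\frac{1}{2\mu^2 x + (1-x)\zeta^2 - p^2 x(1-x)}\right)^{2-d/2}. $$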


The divergent part of this expression is independent of $p$, so to check the
cancellation of the divergence, it suffices to set $p = 0$. It will be instructive to
compute the complete amplitude at $p = 0$, including the finite terms. Adding
the three loop diagrams and the counterterm, whose value is given by (11.33),
we find

$$ \begin{aligned} (\pi\pi\ \text{amplitude})\Big|_{p=0} = -i\lambda\delta^{ij} \Bigg[ & \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left( \frac{1}{(2\mu^2)^{1-d/2}} + \frac{N+1}{(\zeta^2)^{1-d/2}} \right) \\ & - 4\lambda v^2\, \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}} \int_0^1 dx \left( \frac{1}{2\mu^2 x + \zeta^2(1-x)} \right)^{2-d/2} \\ & - \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left( \frac{3}{(2\mu^2)^{1-d/2}} + \frac{N-1}{(\zeta^2)^{1-d/2}} \right) \Bigg]. \end{aligned} \tag{11.38} $$

It is not hard to simplify this expression. The first and third lines can be
combined to give

$$ -2i\lambda\delta^{ij}\, \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left[ \frac{1}{(\zeta^2)^{1-d/2}} - \frac{1}{(2\mu^2)^{1-d/2}} \right]. $$

Near $d = 2$ the quantity in brackets is proportional to $1 - d/2$, and this factor
cancels the pole in the gamma function. Thus the worst divergence cancels,
leaving only a pole at $d = 4$. Using the identity $\Gamma(x) = \Gamma(x+1)/x$, we can
rewrite the above expression as

$$ -2i\lambda\delta^{ij}\, \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}}\, \frac{1}{1-d/2} \left[ \frac{\zeta^2}{(\zeta^2)^{2-d/2}} - \frac{2\mu^2}{(2\mu^2)^{2-d/2}} \right]. \tag{11.39} $$


The first term vanishes for $d > 2$ and $\zeta \to 0$, and can be neglected. Meanwhile,
the second line of expression (11.38) involves the elementary integral

$$ \int_0^1 dx\, \big(2\mu^2 x + (1-x)\zeta^2\big)^{d/2-2} = \frac{1}{d/2-1} \cdot \frac{(2\mu^2)^{d/2-1} - (\zeta^2)^{d/2-1}}{2\mu^2 - \zeta^2}. $$

This expression is also nonsingular at $d = 2$ and reduces to

$$ \frac{1}{d/2-1}\, (2\mu^2)^{d/2-2} $$

for $d > 2$ and $\zeta \to 0$. Comparing this line with the remaining term from
(11.39), and recalling that $\lambda v^2 = \mu^2$, we find that the $\pi\pi$ amplitude is not
only finite, but vanishes completely at $p = 0$.

This result is very attractive. The $\pi\pi$ amplitude, at $p = 0$, is precisely
the mass shift $\delta m^2$ of the $\pi$ field. We already knew that the $\pi$ particles are
massless at tree level; they are the $N-1$ massless bosons required by
Goldstone's theorem. We have now verified that these bosons remain massless at
the one-loop level in the linear sigma model; in other words, the first
quantum corrections to the linear sigma model also respect Goldstone's theorem.
At the end of this chapter, we will give a general argument that Goldstone's
theorem is satisfied to all orders in perturbation theory.
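The cancellation at $p = 0$ can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the bracket of (11.38) at a non-integer dimension, using the closed form of the Feynman-parameter integral quoted above (itself compared against a midpoint-rule quadrature). The function names and the parameter values ($d = 3$, $\lambda = 0.5$, $\mu^2 = 1$, $N = 4$) are illustrative assumptions.

```python
import math

def feynman_integral_closed(d, mu2, zeta2):
    """Closed form of int_0^1 dx (2 mu^2 x + (1-x) zeta^2)^(d/2 - 2)."""
    n = d / 2 - 1
    return ((2 * mu2) ** n - zeta2 ** n) / (n * (2 * mu2 - zeta2))

def feynman_integral_midpoint(d, mu2, zeta2, steps=20000):
    """Same integral by the midpoint rule (only reliable when zeta2 is not tiny)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (2 * mu2 * x + zeta2 * (1 - x)) ** (d / 2 - 2)
    return h * total

def pipi_bracket(d, lam, mu2, zeta2, N):
    """Bracket of (11.38): the pi-pi amplitude at p = 0 divided by (-i lam delta^ij)."""
    v2 = mu2 / lam                      # tree-level relation lam * v^2 = mu^2
    pref = 1.0 / (4 * math.pi) ** (d / 2)
    g1 = math.gamma(1 - d / 2)
    g2 = math.gamma(2 - d / 2)
    p = d / 2 - 1                       # 1/(m^2)^(1-d/2) = (m^2)^p
    line1 = g1 * pref * ((2 * mu2) ** p + (N + 1) * zeta2 ** p)
    line2 = -4 * lam * v2 * g2 * pref * feynman_integral_closed(d, mu2, zeta2)
    line3 = -g1 * pref * (3 * (2 * mu2) ** p + (N - 1) * zeta2 ** p)
    return line1 + line2 + line3

# The sum is nonzero for a finite regulator mass and shrinks as zeta^2 -> 0.
for zeta2 in (0.5, 1e-2, 1e-4, 1e-8):
    print(zeta2, pipi_bracket(3.0, 0.5, 1.0, zeta2, 4))
```

The printed values should decrease toward zero roughly in proportion to $\zeta^2$, which is the numerical counterpart of the statement that the $\pi$ mass shift vanishes once the infrared regulator is removed.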
11.3 The Effective Action

In the first section of this chapter, we analyzed spontaneous symmetry
breaking in classical field theory. That analysis was geometrical: We found the
vacuum state by finding the deepest well in a potential surface, and we proved
Goldstone's theorem by showing that symmetry required the presence of a line
of degenerate minima at the bottom of the well. But this geometrical picture
was lost, or at least disguised, in the one-loop calculations of Section 11.2. It
seems worthwhile to develop a formalism that will allow us to use geometrical
arguments about spontaneous symmetry breaking even at the quantum level.

To define our goal somewhat better, consider the problem of determining
the vacuum expectation value of the quantum field $\phi$. This expectation value
should be determined as a function of the parameters of the Lagrangian. At
the classical level, it is easy to compute $\langle\phi\rangle$; one minimizes the potential
energy. However, as we have seen in the previous section, this classical value
can be altered by perturbative loop corrections. In fact, we saw that $\langle\phi\rangle$ could
be shifted by a potentially divergent quantity, which we needed to control by
renormalization.

It would be wonderful if, in the full quantum field theory, there were a
function whose minimum gave the exact value of $\langle\phi\rangle$. This function would
agree with the classical potential energy to lowest order in perturbation
theory, but it would be modified in higher orders by quantum corrections.
Presumably, these corrections would need renormalization to remove infinities.
Nevertheless, after renormalization, this quantity should give the same
relations between $\langle\phi\rangle$ and particle masses and couplings that we would find by
direct Feynman diagram calculations. In this section, we will exhibit a
function with these properties, called the effective potential. In Section 11.4 we
will explain how to compute the effective potential in perturbation theory, in
terms of renormalized masses and couplings. Then we will go on to use it as
a tool in analyzing the renormalizability of theories with hidden symmetry.

To identify the effective potential, consider the analogy between quantum
field theory and statistical mechanics set out in Section 9.3. In that section,
we derived a correspondence between the correlation functions of a quantum
field and those of a related statistical system, with quantum fluctuations being
replaced by thermal fluctuations. At zero temperature the thermodynamic
ground state is the state of lowest energy, but at nonzero temperature we
still have a geometrical picture of the preferred thermodynamic state: It is
the state that minimizes the Gibbs free energy. More explicitly, taking the
example of a magnetic system, one defines the Helmholtz free energy $F(H)$
by

$$ Z(H) = e^{-\beta F(H)} = \int \mathcal{D}s\, \exp\left[ -\beta \int dx\, \big( \mathcal{H}[s] - H s(x) \big) \right], \tag{11.40} $$

where $H$ is the external magnetic field, $\mathcal{H}[s]$ is the spin energy density, and
$\beta = 1/kT$. We can find the magnetization of the system by differentiating
$F(H)$:

$$ \begin{aligned} -\frac{\partial F}{\partial H}\bigg|_{\beta\ \text{fixed}} &= \frac{1}{\beta} \frac{\partial}{\partial H} \log Z \\ &= \frac{1}{Z} \int dx \int \mathcal{D}s\; s(x)\, \exp\left[ -\beta \int dx\, \big( \mathcal{H}[s] - Hs \big) \right] \\ &= \int dx\, \langle s(x) \rangle \equiv M. \end{aligned} \tag{11.41} $$

The Gibbs free energy $G$ is defined by the Legendre transformation

$$ G = F + MH, $$

so that it satisfies

$$ \begin{aligned} \frac{\partial G}{\partial M} &= \frac{\partial F}{\partial M} + M \frac{\partial H}{\partial M} + H \\ &= \frac{\partial H}{\partial M} \frac{\partial F}{\partial H} + M \frac{\partial H}{\partial M} + H \\ &= H \end{aligned} \tag{11.42} $$

(where all partial derivatives are taken with $\beta$ fixed). If $H = 0$, the Gibbs free
energy reaches an extremum at the corresponding value of $M$. The
thermodynamically most stable state is the minimum of $G(M)$. Thus the function
$G(M)$ gives a picture of the preferred thermodynamic state that is geometrical
and at the same time includes all effects of thermal fluctuations.
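As an illustration of relations (11.41) and (11.42), here is a minimal sketch for the simplest magnetic system one can write down: a single spin $s = \pm 1$ with energy $-Hs$, for which $Z(H) = 2\cosh(\beta H)$. This toy model, the value of `BETA`, and all function names are assumptions made for the example and do not appear in the text.

```python
import math

BETA = 2.0  # inverse temperature of the toy model (illustrative value)

def helmholtz(H):
    """F(H) = -(1/beta) log Z for a single spin, Z = 2 cosh(beta H)."""
    return -math.log(2 * math.cosh(BETA * H)) / BETA

def magnetization(H):
    """M = -dF/dH = tanh(beta H) for this toy model."""
    return math.tanh(BETA * H)

def field_from_magnetization(M):
    """Inverse of magnetization(H), valid for |M| < 1."""
    return math.atanh(M) / BETA

def gibbs(M):
    """Legendre transform G(M) = F(H) + M H, with H expressed through M."""
    H = field_from_magnetization(M)
    return helmholtz(H) + M * H

def derivative(f, x, h=1e-6):
    """Central finite-difference derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

H0 = 0.3
M0 = magnetization(H0)
print("-dF/dH:", -derivative(helmholtz, H0), " M:", M0)
print(" dG/dM:", derivative(gibbs, M0), " H:", H0)
print(" dG/dM at M = 0:", derivative(gibbs, 0.0))
```

The two finite-difference derivatives should reproduce $M$ and $H$ respectively, and the last line shows that $G(M)$ is stationary at the magnetization realized when $H = 0$.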

By analogy, we can construct a similar quantity in a quantum field theory.
For simplicity, we will work in this section only with a theory of one scalar
field. All of the results generalize straightforwardly to systems with multiple
scalar, spinor, and vector fields.

Consider a quantum field theory of a scalar field $\phi$, in the presence of an
external source $J$. As in Chapter 9, it is useful to take the external source to
depend on $x$. Thus, we define an energy functional $E[J]$ by

$$ Z[J] = e^{-iE[J]} = \int \mathcal{D}\phi\, \exp\left[ i \int d^4x\, \big( \mathcal{L}[\phi] + J\phi \big) \right]. \tag{11.43} $$

The right-hand side of this equation is the functional integral representation
of the amplitude $\langle\Omega|\, e^{-iHT}\, |\Omega\rangle$, where $T$ is the time extent of the functional
integration, in the presence of the source $J$. Thus, $E[J]$ is just the vacuum
energy as a function of the external source. The functional $E[J]$ is the analogue
of the Helmholtz free energy, and $J$ is the analogue of the external magnetic
field.

In principle, we could now Legendre-transform $E[J]$ with respect to a
constant value of the source. However, since we have already developed a
formalism for functional integration and differentiation, it will not be much
more difficult to work with an external source $J(x)$ that depends on $x$ in an
arbitrary way. As we will see, this generalization yields additional relations
which connect this formalism to our general study of renormalization theory.†
Consider, then, the functional derivative of $E[J]$ with respect to $J(x)$:

$$ \frac{\delta}{\delta J(x)} E[J] = i \frac{\delta}{\delta J(x)} \log Z = -\frac{\int \mathcal{D}\phi\; e^{i\int(\mathcal{L}+J\phi)}\, \phi(x)}{\int \mathcal{D}\phi\; e^{i\int(\mathcal{L}+J\phi)}}. \tag{11.44} $$

We abbreviate this relation as

$$ \frac{\delta}{\delta J(x)} E[J] = -\langle\Omega|\, \phi(x)\, |\Omega\rangle_J; \tag{11.45} $$

the right-hand side is the vacuum expectation value in the presence of a
nonzero source $J(x)$. This relation is a functional analogue of Eq. (11.41):
The functional derivative of $E[J]$ gives the expectation value of $\phi$ in the
presence of the spatially varying source. We should treat this expectation value as
the thermodynamic variable conjugate to $J(x)$. Thus we define the quantity
$\phi_{\rm cl}(x)$, called the classical field, by

$$ \phi_{\rm cl}(x) = \langle\Omega|\, \phi(x)\, |\Omega\rangle_J. \tag{11.46} $$

The classical field is related to $\phi(x)$ in the same way that the magnetization $M$
is related to the local spin field $s(x)$: It is a weighted average over all possible
fluctuations. Note that $\phi_{\rm cl}(x)$ depends on the external source $J(x)$, just as $M$
depends on $H$.

Now, in analogy with the construction of the Gibbs free energy, define the
Legendre transform of $E[J]$:

$$ \Gamma[\phi_{\rm cl}] \equiv -E[J] - \int d^4y\, J(y)\, \phi_{\rm cl}(y). \tag{11.47} $$

This quantity is known as the effective action. In analogy with Eq. (11.42),
we can now compute

$$ \begin{aligned} \frac{\delta}{\delta\phi_{\rm cl}(x)} \Gamma[\phi_{\rm cl}] &= -\frac{\delta}{\delta\phi_{\rm cl}(x)} E[J] - \int d^4y\, \frac{\delta J(y)}{\delta\phi_{\rm cl}(x)}\, \phi_{\rm cl}(y) - J(x) \\ &= -\int d^4y\, \frac{\delta J(y)}{\delta\phi_{\rm cl}(x)} \frac{\delta E[J]}{\delta J(y)} - \int d^4y\, \frac{\delta J(y)}{\delta\phi_{\rm cl}(x)}\, \phi_{\rm cl}(y) - J(x) \\ &= -J(x). \end{aligned} \tag{11.48} $$

In the last step we have used Eq. (11.45).

For each of the thermodynamic quantities discussed at the beginning of
this section, we have now defined an analogous quantity in quantum field
theory. Table 11.1 summarizes these analogies.


†This functional generalization of thermodynamics is due to C. DeDominicis and
P. Martin, J. Math. Phys. 5, 14 (1964), and was formulated for relativistic field theory
by G. Jona-Lasinio, Nuovo Cim. 34A, 1790 (1964).
Magnetic System          Quantum Field Theory

$x$                      $x = (t, \mathbf{x})$
$s(x)$                   $\phi(x)$
$H$                      $J(x)$
$\mathcal{H}(s)$         $\mathcal{L}(\phi)$
$Z(H)$                   $Z[J]$
$F(H)$                   $E[J]$
$M$                      $\phi_{\rm cl}(x)$
$G(M)$                   $-\Gamma[\phi_{\rm cl}]$

Table 11.1. Analogous quantities in a magnetic system and a scalar quantum
field theory.

Relation (11.48) implies that, if the external source is set to zero, the
effective action satisfies the equation

$$ \frac{\delta}{\delta\phi_{\rm cl}(x)} \Gamma[\phi_{\rm cl}] = 0. \tag{11.49} $$

The solutions to this equation are the values of $\langle\phi(x)\rangle$ in the stable quantum
states of the theory. For a translation-invariant vacuum state, we will find a
solution in which $\phi_{\rm cl}$ is independent of $x$. Sometimes, Eq. (11.49) will have
additional solutions, corresponding to localized lumps of field held together
by their self-interaction. In these states, called solitons, the solution $\phi_{\rm cl}(x)$
depends on $x$.

From here on we will assume, for the field theories we consider, that the
possible vacuum states are invariant under translations and Lorentz
transformations.* Then, for each possible vacuum state, the corresponding solution
$\phi_{\rm cl}$ will be a constant, independent of $x$, and the process of solving Eq. (11.49)
reduces to that of solving an ordinary equation of one variable ($\phi_{\rm cl}$).
Furthermore, we know that $\Gamma$ is, in thermodynamic terms, an extensive quantity: It
is proportional to the volume of the spacetime region over which the
functional integral is taken. If $T$ is the time extent of this region and $V$ is its
three-dimensional volume, we can write

$$ \Gamma[\phi_{\rm cl}] = -(VT) \cdot V_{\rm eff}(\phi_{\rm cl}). \tag{11.50} $$

The coefficient $V_{\rm eff}$ is called the effective potential. The condition that $\Gamma[\phi_{\rm cl}]$
has an extremum then reduces to the simple equation

$$ \frac{\partial}{\partial\phi_{\rm cl}} V_{\rm eff}(\phi_{\rm cl}) = 0. \tag{11.51} $$


*Certain condensed matter systems have ground states with preferred orientation;
see, for example, P. G. de Gennes, The Physics of Liquid Crystals (Oxford University
Press, 1974).
Each solution of Eq. (11.51) is a translation-invariant state with $J = 0$.
Equation (11.47) implies that $\Gamma = -E$ in this case, and therefore that $V_{\rm eff}(\phi_{\rm cl})$,
evaluated at a solution to (11.51), is just the energy density of the
corresponding state.

Figure 11.6 illustrates one possible shape for the function $V_{\rm eff}(\phi)$. The
local maxima (or, for systems of several fields $\phi^i$, possible saddle points) are
unstable configurations that cannot be realized as stationary states. The figure
also contains a local minimum of $V_{\rm eff}$ that is not the absolute minimum; this is
a metastable vacuum state, which can decay to the true vacuum by
quantum-mechanical tunneling. The absolute minimum of $V_{\rm eff}$ is the state of lowest
energy in the theory, and thus the true, stable, vacuum state. A system with
spontaneously broken symmetry will have several minima of $V_{\rm eff}$, all with the
same energy by virtue of the symmetry. The choice of one among these vacua
is the spontaneous symmetry breaking.

In drawing Fig. 11.6, we have assumed that we are computing the
effective potential for a fixed constant background value of $\phi$. Under some
circumstances, this state does not give the true minimum energy configuration for
states with a given expectation value of $\phi$. This mismatch can occur in the
following way: In a system for which the effective potential for constant
background fields is given by Fig. 11.6, consider choosing a value of $\phi_{\rm cl}$ that is
intermediate between the locally stable vacuum states $\phi_1$ and $\phi_3$:

$$ \phi_{\rm cl} = x\phi_1 + (1-x)\phi_3, \qquad 0 < x < 1. \tag{11.52} $$

The assumption of a constant background field gives a large value of the
effective potential, as indicated in the figure. We can obtain a lower-energy
configuration by considering states with macroscopic regions in which $\langle\phi\rangle = \phi_1$
and other regions in which $\langle\phi\rangle = \phi_3$, in such a way that the average value of
$\langle\phi\rangle$ over the whole system is $\phi_{\rm cl}$. For such a configuration, the average vacuum
energy is given by

$$ V_{\rm eff}(\phi_{\rm cl}) = x V_{\rm eff}(\phi_1) + (1-x) V_{\rm eff}(\phi_3), \tag{11.53} $$

as shown in Fig. 11.7. We have called the left-hand side of this equation
$V_{\rm eff}(\phi_{\rm cl})$ because the result (11.53) would be the result of an exact evaluation
of the functional integral definition of $V_{\rm eff}$ for values of $\phi_{\rm cl}$ satisfying (11.52).
The interpolation (11.53) is the field theoretic analogue of the Maxwell
construction for the thermodynamic free energy. In general, for any $\phi_{\rm cl}$, $\phi_1$, $\phi_3$
satisfying (11.52), the estimate (11.53) will be an upper bound to the effective
potential; we say that the effective potential is a convex function of $\phi_{\rm cl}$.‡
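The Maxwell construction can be mimicked numerically by taking the lower convex envelope of a sampled potential. The sketch below is only an illustration under stated assumptions: the symmetric double well `toy_potential` is a hypothetical stand-in for a constant-background effective potential (it is not the curve of Fig. 11.6), and the envelope is computed with a standard monotone-chain lower hull.

```python
import bisect

def lower_convex_envelope(points):
    """Lower convex hull of (x, y) points sorted by x (monotone chain)."""
    hull = []
    for px, py in points:
        # Pop the last hull point while it lies on or above the chord
        # joining its predecessor to the new point.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

def envelope_at(hull, x):
    """Evaluate the envelope at x by linear interpolation between hull points."""
    xs = [p[0] for p in hull]
    j = bisect.bisect_right(xs, x)
    j = min(max(j, 1), len(hull) - 1)
    (x1, y1), (x2, y2) = hull[j - 1], hull[j]
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def toy_potential(phi):
    """Hypothetical double well with degenerate minima at phi = -1 and phi = +1."""
    return (phi * phi - 1.0) ** 2

grid = [-2.0 + 0.01 * i for i in range(401)]
samples = [(phi, toy_potential(phi)) for phi in grid]
hull = lower_convex_envelope(samples)

print("V(0) =", toy_potential(0.0), " envelope(0) =", envelope_at(hull, 0.0))
print("V(1.5) =", toy_potential(1.5), " envelope(1.5) =", envelope_at(hull, 1.5))
```

Between the two minima the envelope replaces the central hump by a straight segment, which for this degenerate toy case is the interpolation (11.53); outside the minima the envelope coincides with the sampled potential.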

Just as in thermodynamics, straightforward schemes for computing the
effective potential do not take account of the possibility of phase separation
and so lead to a structure of unstable and metastable configurations of the
type shown in Fig. 11.6. The Maxwell construction must be performed by
hand to yield the final form of $V_{\rm eff}(\phi_{\rm cl})$. Fortunately, the absolute minimum
of $V_{\rm eff}$ is not affected by this nicety.

‡The convexity of the Gibbs free energy is a well-known exact result in
statistical mechanics; see, for example, D. Ruelle, Statistical Mechanics (W. A. Benjamin,
Reading, Mass., 1969).

Figure 11.6. A possible form for the effective potential in a scalar field
theory. The extrema of the effective potential occur at the points $\phi_{\rm cl} = \phi_1, \phi_2, \phi_3$.
The true vacuum state is the one corresponding to $\phi_1$. The state $\phi_2$ is
unstable. The state $\phi_3$ is metastable, but it can decay to $\phi_1$ by quantum-mechanical
tunneling.

Figure 11.7. Exact convex form of the effective potential for the system of
Fig. 11.6.

We have now solved the problem that we posed at the beginning of this
section: The effective potential, defined by Eqs. (11.47) and (11.50), gives an
easily visualized function whose minimization defines the exact vacuum state
of the quantum field theory, including all effects of quantum corrections. It is
not obvious from these definitions how to compute $V_{\rm eff}(\phi_{\rm cl})$. We will see how
to do so in the next section, by direct evaluation of the functional integral.
11.4 Computation of the Effective Action

Now that we have defined the object whose minimization gives the exact
vacuum state of a quantum field theory, we must learn how to compute it.
This can be done in more than one way. The simplest method, which we will
use here, requires that we be bold enough to evaluate the complete effective
action $\Gamma$ directly from its functional integral definition. After computing $\Gamma$,
we can obtain $V_{\rm eff}$ by specializing to constant values of $\phi_{\rm cl}$.

Our plan is to find a perturbation expansion for the generating functional
$Z$, starting with its functional integral definition (11.43). We will then take the
logarithm to obtain the energy functional $E$, and finally Legendre-transform
according to Eq. (11.47) to obtain $\Gamma$. We will use renormalized perturbation
theory, so it is convenient to split the Lagrangian as we did in Eq. (10.18),
into a piece depending on renormalized parameters and one containing the
counterterms:

$$ \mathcal{L} = \mathcal{L}_1 + \delta\mathcal{L}. \tag{11.54} $$

We wish to compute $\Gamma$ as a function of $\phi_{\rm cl}$. But the functional $Z[J]$ depends
on $\phi_{\rm cl}$ through its dependence on $J$. Thus, we must find, at least implicitly, a
relation between $J(x)$ and $\phi_{\rm cl}(x)$. At the lowest order in perturbation theory,
that relation is just the classical field equation:

$$ \frac{\delta\mathcal{L}}{\delta\phi}\bigg|_{\phi=\phi_{\rm cl}} + J(x) = 0 \qquad \text{(to lowest order)}. $$

Let us define $J_1(x)$ to be whatever function satisfies this equation exactly,
when $\mathcal{L} = \mathcal{L}_1$:

$$ \frac{\delta\mathcal{L}_1}{\delta\phi}\bigg|_{\phi=\phi_{\rm cl}} + J_1(x) = 0 \qquad \text{(exactly)}. \tag{11.55} $$

We will think of the difference between $J$ and $J_1$ as a counterterm, analogous
to $\delta\mathcal{L}$, so we write

$$ J(x) = J_1(x) + \delta J(x), \tag{11.56} $$

where $\delta J$ is determined, order by order in perturbation theory, by the original
definition (11.46) of $\phi_{\rm cl}$, namely $\langle\phi(x)\rangle_J = \phi_{\rm cl}(x)$.

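†This method is due to R. Jackiw, Phys. Rev. D9, 1686 (1974).

Using this notation, we rewrite Eq. (11.43) as

$$ e^{-iE[J]} = \int \mathcal{D}\phi\; e^{i\int d^4x\, (\mathcal{L}_1[\phi] + J_1\phi)}\; e^{i\int d^4x\, (\delta\mathcal{L}[\phi] + \delta J\phi)}. \tag{11.57} $$

The second exponential contains the counterterms; leave this aside for the
moment. In the first exponential, expand the exponent about $\phi_{\rm cl}$ by replacing
$\phi(x) = \phi_{\rm cl}(x) + \eta(x)$. This exponent takes the form

$$ \begin{aligned} \int d^4x\, (\mathcal{L}_1 + J_1\phi) = {} & \int d^4x\, (\mathcal{L}_1[\phi_{\rm cl}] + J_1\phi_{\rm cl}) + \int d^4x\, \eta(x) \left( \frac{\delta\mathcal{L}_1}{\delta\phi} + J_1 \right) \\ & + \frac{1}{2} \int d^4x\, d^4y\; \eta(x)\eta(y)\, \frac{\delta^2\mathcal{L}_1}{\delta\phi(x)\,\delta\phi(y)} \\ & + \frac{1}{3!} \int d^4x\, d^4y\, d^4z\; \eta(x)\eta(y)\eta(z)\, \frac{\delta^3\mathcal{L}_1}{\delta\phi(x)\,\delta\phi(y)\,\delta\phi(z)} + \cdots \end{aligned} \tag{11.58} $$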

where the various functional derivatives of $\mathcal{L}_1$ are evaluated at $\phi_{\rm cl}(x)$. Notice
that the term linear in $\eta$ vanishes by the use of Eq. (11.55). The integral
over $\eta$ is thus a Gaussian integral, with the cubic and higher terms giving
perturbative corrections.

We will describe a formal evaluation of this integral, following the
prescriptions of Section 9.2. The ingredients in this evaluation will be the coefficients
of Eq. (11.58), that is, the successive functional derivatives of $\mathcal{L}_1$. For the
moment, please accept that these give well-defined operators. After presenting a
general expression for $\Gamma[\phi_{\rm cl}]$, we will carry out this calculation explicitly in a
scalar field theory example. We will see in this example that the formal
operators correspond to expressions familiar from Feynman diagram perturbation
theory.

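Let us, then, consider performing the integral over $\eta(x)$ using the
expansion (11.58). Keeping only the terms up to quadratic order in $\eta$, and still
neglecting the counterterms, we have a pure Gaussian integral, which can be
evaluated in terms of a functional determinant:

$$ \begin{aligned} \int \mathcal{D}\eta\, & \exp\left[ i \left( \int (\mathcal{L}_1[\phi_{\rm cl}] + J_1\phi_{\rm cl}) + \frac{1}{2} \int \eta\, \frac{\delta^2\mathcal{L}_1}{\delta\phi\,\delta\phi}\, \eta \right) \right] \\ & = \exp\left[ i \int (\mathcal{L}_1[\phi_{\rm cl}] + J_1\phi_{\rm cl}) \right] \cdot \left( \det\left[ -\frac{\delta^2\mathcal{L}_1}{\delta\phi\,\delta\phi} \right] \right)^{-1/2}. \end{aligned} \tag{11.59} $$

This functional determinant will give us the lowest-order quantum correction
to the effective action, and for many purposes it is unnecessary to go further
in the expansion (11.58). Later we will see that if we do include the cubic
and higher terms in $\eta$, these produce a Feynman diagram expansion of the
functional integral (11.57) in which the propagator is the operator inverse

$$ i \left( \frac{\delta^2\mathcal{L}_1}{\delta\phi\,\delta\phi} \right)^{-1} \tag{11.60} $$

and the vertices are the third and higher functional derivatives of $\mathcal{L}_1$.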

Finally, let us put back the effects of the second exponential in Eq. (11.57),
that is, the counterterm Lagrangian. It is useful to expand this term about
$\phi = \phi_{\rm cl}$, writing it as

$$ (\delta\mathcal{L}[\phi_{\rm cl}] + \delta J\phi_{\rm cl}) + (\delta\mathcal{L}[\phi_{\rm cl} + \eta] - \delta\mathcal{L}[\phi_{\rm cl}] + \delta J\eta). \tag{11.61} $$

The second term of (11.61) can be expanded as a Taylor series in $\eta$; the
successive terms give counterterm vertices which can be included in the
aforementioned Feynman diagrams. The first term is a constant with respect to the
functional integral over $\eta$, and therefore gives additional terms in the exponent
of Eq. (11.59).

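Combining the integral (11.59) with the contributions from higher-order
vertices and counterterms, one can obtain a complete expression for the
functional integral (11.57). We will see in the example below that the Feynman
diagrams representing the higher-order terms can be arranged to give the
exponential of the sum of connected diagrams. Thus one obtains the following
expression for $E[J]$:

$$ \begin{aligned} -iE[J] = {} & i \int d^4x\, (\mathcal{L}_1[\phi_{\rm cl}] + J_1\phi_{\rm cl}) - \frac{1}{2} \log\det\left[ -\frac{\delta^2\mathcal{L}_1}{\delta\phi\,\delta\phi} \right] \\ & + (\text{connected diagrams}) + i \int d^4x\, (\delta\mathcal{L}[\phi_{\rm cl}] + \delta J\phi_{\rm cl}). \end{aligned} \tag{11.62} $$

From this equation, $\Gamma$ follows directly: Using $J_1 + \delta J = J$ and the Legendre
transform (11.47), we find

$$ \begin{aligned} \Gamma[\phi_{\rm cl}] = {} & \int d^4x\, \mathcal{L}_1[\phi_{\rm cl}] + \frac{i}{2} \log\det\left[ -\frac{\delta^2\mathcal{L}_1}{\delta\phi\,\delta\phi} \right] \\ & - i \cdot (\text{connected diagrams}) + \int d^4x\, \delta\mathcal{L}[\phi_{\rm cl}]. \end{aligned} \tag{11.63} $$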

Notice that there are no terms remaining that depend explicitly on $J$; thus,
$\Gamma$ is expressed as a function of $\phi_{\rm cl}$, as it should be. The Feynman diagrams
contributing to $\Gamma[\phi_{\rm cl}]$ have no external lines, and the simplest ones turn out
to have two loops. The lowest-order quantum correction to $\Gamma$ is given by the
functional determinant, and this term is all that we will make use of in this
book.

The last term of (11.63) provides a set of counterterms that can be used
to satisfy the renormalization conditions on $\Gamma$ and, in the process, to cancel
divergences that appear in the evaluation of the functional determinant and the
diagrams. We will show in the example below exactly how this cancellation
works. The renormalization conditions will determine all of the counterterms
in $\delta\mathcal{L}$. However, the formalism we have constructed contains a new
counterterm $\delta J$. That coefficient is determined by the following special criterion: In
Eq. (11.55), we set up our analysis in such a way that, at the leading order,
$\langle\phi\rangle = \phi_{\rm cl}$. Potentially, however, this relation could break down at higher
orders: The quantity $\langle\phi\rangle$ could receive additional contributions from Feynman
diagrams that might shift it from the value $\phi_{\rm cl}$. This will happen if there are
nonzero tadpole diagrams that contribute to $\langle\eta\rangle$. But this amplitude also
receives a contribution from the counterterm $(\delta J\,\eta)$ in (11.61). Thus we can
maintain $\langle\eta\rangle = 0$, and in the process determine $\delta J$ to any order, by adjusting
$\delta J$ to satisfy the diagrammatic equation

[Diagrammatic equation: the one-point (tadpole) diagrams plus the $\delta J$ counterterm sum to zero] (11.64)

In practice, we will satisfy this condition by simply ignoring any
one-particle-irreducible one-point diagram, since any such diagram will be canceled by
adjustment of $\delta J$. The removal of these tadpole diagrams, which we needed
some effort to arrange in Section 11.2, is thus built in here as a natural part
of the formalism.


The Effective Action in the Linear Sigma Model 

In Eq. (11.63), we have given a complete, though not exactly transparent,
evaluation of $\Gamma[\phi_{\rm cl}]$. Let us now clarify the meaning of this equation, and also
put it to some good use, by computing $\Gamma[\phi_{\rm cl}]$ in the linear sigma model. We
will see that the results that we obtained by brute-force perturbation theory
in Section 11.2 emerge much more naturally from Eq. (11.63).

We begin again with the Lagrangian (11.5):

$$ \mathcal{L} = \frac{1}{2}(\partial_\mu\phi^i)^2 + \frac{1}{2}\mu^2(\phi^i)^2 - \frac{\lambda}{4}\big[(\phi^i)^2\big]^2. \tag{11.65} $$

Expand about the classical field: $\phi^i = \phi^i_{\rm cl} + \eta^i$. Because we expect to find a
translation-invariant vacuum state, we will specialize to the case of a constant
classical field. This will simplify some elements of the calculation below. In
particular, according to Eq. (11.50), the final result will be proportional to
the four-dimensional volume $(VT)$ of the functional integration. When this
dependence is factored out, we will obtain a well-defined intensive expression
for the effective potential. In any event, after this simplification, (11.65) takes
the form

$$ \begin{aligned} \mathcal{L} = {} & \frac{1}{2}\mu^2(\phi^i_{\rm cl})^2 - \frac{\lambda}{4}\big[(\phi^i_{\rm cl})^2\big]^2 + \big(\mu^2 - \lambda(\phi^i_{\rm cl})^2\big)\phi^j_{\rm cl}\eta^j \\ & + \frac{1}{2}(\partial_\mu\eta^i)^2 + \frac{1}{2}\mu^2(\eta^i)^2 - \frac{\lambda}{2}\big[(\phi^i_{\rm cl})^2(\eta^j)^2 + 2(\phi^i_{\rm cl}\eta^i)^2\big] + \cdots. \end{aligned} \tag{11.66} $$

According to Eq. (11.63), we should drop the term linear in $\eta$.

From the terms quadratic in $\eta$, we can read off

$$ \frac{\delta^2\mathcal{L}}{\delta\phi^i\,\delta\phi^j} = -\partial^2\delta^{ij} + \mu^2\delta^{ij} - \lambda\big[(\phi^k_{\rm cl})^2\delta^{ij} + 2\phi^i_{\rm cl}\phi^j_{\rm cl}\big]. \tag{11.67} $$

Notice that this object has the general form of a Klein-Gordon operator. To
clarify this relation, let us orient the coordinates so that $\phi^i_{\rm cl}$ points in the $N$th
direction,

$$ \phi^i_{\rm cl} = (0, 0, \ldots, 0, \phi_{\rm cl}), \tag{11.68} $$

as we did in Eq. (11.7). Then the operator (11.67) is just equal to the
Klein-Gordon operator $(-\partial^2 - m_i^2)$, where

$$ m_i^2 = \begin{cases} \lambda\phi_{\rm cl}^2 - \mu^2 & \text{acting on } \eta^1, \ldots, \eta^{N-1}; \\ 3\lambda\phi_{\rm cl}^2 - \mu^2 & \text{acting on } \eta^N. \end{cases} \tag{11.69} $$
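The mass eigenvalues in (11.69) do not depend on the choice of axes made in (11.68). As a small sketch of this point, the code below builds the momentum-independent part of (11.67), $m^2_{ij} = (\lambda\phi_{\rm cl}^2 - \mu^2)\delta^{ij} + 2\lambda\phi^i_{\rm cl}\phi^j_{\rm cl}$, for a classical field pointing in an arbitrary direction and applies it to a radial and a transverse vector. The helper names and the numerical values are illustrative assumptions.

```python
def mass_matrix(phi_cl, lam, mu2):
    """m^2_ij = (lam |phi_cl|^2 - mu^2) delta_ij + 2 lam phi_i phi_j, from (11.67)."""
    n = len(phi_cl)
    phi2 = sum(c * c for c in phi_cl)
    return [[(lam * phi2 - mu2) * (1.0 if i == j else 0.0)
             + 2.0 * lam * phi_cl[i] * phi_cl[j]
             for j in range(n)] for i in range(n)]

def apply(matrix, vec):
    """Matrix-vector product with plain lists."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

lam, mu2 = 0.5, 1.0
phi_cl = [0.3, -1.2, 0.4]            # arbitrary direction, N = 3
phi2 = sum(c * c for c in phi_cl)
m2 = mass_matrix(phi_cl, lam, mu2)

radial = apply(m2, phi_cl)           # expect (3 lam phi^2 - mu^2) * phi_cl
transverse_vec = [1.2, 0.3, 0.0]     # orthogonal to phi_cl
transverse = apply(m2, transverse_vec)  # expect (lam phi^2 - mu^2) * transverse_vec

print("sigma-like eigenvalue:", radial[0] / phi_cl[0], 3 * lam * phi2 - mu2)
print("pi-like eigenvalue:   ", transverse[0] / transverse_vec[0], lam * phi2 - mu2)
```

The radial direction carries the eigenvalue $3\lambda\phi_{\rm cl}^2 - \mu^2$ and every transverse direction carries $\lambda\phi_{\rm cl}^2 - \mu^2$, in agreement with (11.69).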


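The functional determinant in Eq. (11.63) is the product of the determinants
of these Klein-Gordon operators:

$$ \det\left[ -\frac{\delta^2\mathcal{L}}{\delta\phi^i\,\delta\phi^j} \right] = \big[\det(\partial^2 + (\lambda\phi_{\rm cl}^2 - \mu^2))\big]^{N-1} \big[\det(\partial^2 + (3\lambda\phi_{\rm cl}^2 - \mu^2))\big]. \tag{11.70} $$

It is not difficult to obtain an explicit form for the determinant of a
Klein-Gordon operator. To begin, use the trick of Eq. (9.77) to write

$$ \log\det(\partial^2 + m^2) = {\rm Tr}\log(\partial^2 + m^2). $$

Now evaluate the trace of the operator as the sum of its eigenvalues:

$$ \begin{aligned} {\rm Tr}\log(\partial^2 + m^2) &= \sum_k \log(-k^2 + m^2) \\ &= (VT) \int \frac{d^4k}{(2\pi)^4} \log(-k^2 + m^2). \end{aligned} \tag{11.71} $$

In the second line, we have converted the sum over momenta to an integral.
The factor $(VT)$ is the four-dimensional volume of the functional integral; we
have already noted that this is expected to appear as an overall factor in $\Gamma[\phi_{\rm cl}]$.
This manipulation gives an integral that can be evaluated in dimensional
regularization after a Wick rotation:

$$ \begin{aligned} \int \frac{d^4k}{(2\pi)^4} \log(-k^2 + m^2) &= i \int \frac{d^4k_E}{(2\pi)^4} \log(k_E^2 + m^2) \\ &= -i \frac{\partial}{\partial\alpha} \int \frac{d^dk_E}{(2\pi)^d} \frac{1}{(k_E^2 + m^2)^\alpha} \bigg|_{\alpha=0} \\ &= -i \frac{\partial}{\partial\alpha} \left( \frac{1}{(4\pi)^{d/2}} \frac{\Gamma(\alpha - \frac{d}{2})}{\Gamma(\alpha)} \frac{1}{(m^2)^{\alpha - d/2}} \right) \bigg|_{\alpha=0} \\ &= -i \frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}} \frac{1}{(m^2)^{-d/2}}. \end{aligned} \tag{11.72} $$

In the last line, we have used $\Gamma(\alpha) \to 1/\alpha$ as $\alpha \to 0$. Thus,

$$ \frac{1}{(VT)} \log\det(\partial^2 + m^2) = -i \frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}} (m^2)^{d/2}. \tag{11.73} $$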
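The last step of (11.72), differentiating with respect to $\alpha$ at $\alpha = 0$, can be checked numerically at a non-integer dimension where every Gamma function is finite. The sketch below works with the Euclidean integrals only (the overall factor of $i$ is dropped) and uses a central finite difference; the function names, the step size `h`, and the sample values $d = 3.3$, $m^2 = 2$ are assumptions for illustration.

```python
import math

def dimreg_integral(alpha, d, m2):
    """Dimensionally regularized Euclidean integral of 1/(k_E^2 + m^2)^alpha."""
    return (math.gamma(alpha - d / 2) / math.gamma(alpha)
            * m2 ** (d / 2 - alpha) / (4 * math.pi) ** (d / 2))

def log_integral_via_alpha(d, m2, h=1e-5):
    """-d/d(alpha) of dimreg_integral at alpha = 0, by central difference.

    This is the Euclidean integral of log(k_E^2 + m^2) as written in (11.72).
    """
    return -(dimreg_integral(h, d, m2) - dimreg_integral(-h, d, m2)) / (2 * h)

def log_integral_closed(d, m2):
    """Closed form from the last line of (11.72), without the factor of i."""
    return -math.gamma(-d / 2) * m2 ** (d / 2) / (4 * math.pi) ** (d / 2)

d, m2 = 3.3, 2.0
numeric = log_integral_via_alpha(d, m2)
closed = log_integral_closed(d, m2)
print("finite difference:", numeric)
print("closed form:      ", closed)
```

The two numbers should agree to many digits, because $1/\Gamma(\alpha)$ vanishes linearly at $\alpha = 0$ and the derivative simply picks out the remaining factor.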


Using this result to evaluate the determinant in Eq. (11.63), and choosing
the counterterm Lagrangian as in Eq. (11.14), we find

$$ \begin{aligned} V_{\rm eff}(\phi_{\rm cl}) = -\frac{1}{(VT)} \Gamma[\phi_{\rm cl}] = {} & -\frac{1}{2}\mu^2\phi_{\rm cl}^2 + \frac{\lambda}{4}\phi_{\rm cl}^4 \\ & - \frac{1}{2} \frac{\Gamma(-\frac{d}{2})}{(4\pi)^{d/2}} \Big[ (N-1)(\lambda\phi_{\rm cl}^2 - \mu^2)^{d/2} + (3\lambda\phi_{\rm cl}^2 - \mu^2)^{d/2} \Big] \\ & + \frac{1}{2}\delta_\mu\phi_{\rm cl}^2 + \frac{1}{4}\delta_\lambda\phi_{\rm cl}^4. \end{aligned} \tag{11.74} $$

Here we have written $\phi_{\rm cl}^2$ as a shorthand for $(\phi^i_{\rm cl})^2$. Since the second line of
this result is the leading radiative correction, we might expect that the result
has the structure of a one-loop Feynman diagram. Indeed, we see that this
expression contains Gamma functions and ultraviolet divergences similar to
those that we found in the one-loop computations of Section 11.2. We will
show below that this term in fact has exactly the same ultraviolet divergences
that we found in Section 11.2. These divergences will be subtracted by the
counterterms in the last line of Eq. (11.74).

Figure 11.8. Feynman diagrams contributing to the evaluation of the
effective potential of the $O(N)$ linear sigma model: (a) a diagram that is removed
by (11.64); (b) the first nonzero diagrammatic corrections.

Since the computation of the determinant in Eq. (11.63) gives the effect of
one-loop corrections, we might expect the Feynman diagrams that contribute
to Eq. (11.63) to begin in two-loop order. We can see this explicitly for the
case of the $O(N)$ sigma model. The perturbation expansion described below
Eq. (11.60) involves the propagator that is the inverse of Eq. (11.67):

$$ \langle \eta^i(k)\, \eta^j(-k) \rangle = \frac{i}{k^2 - m_i^2}\, \delta^{ij}, \tag{11.75} $$

where $m_i^2$ is given by (11.69). The vertices are given by the terms of order $\eta^3$
and $\eta^4$ in the expansion of the Lagrangian. Combining these ingredients, we
find that the leading Feynman diagrams contributing to the vacuum energy
have the forms shown in Fig. 11.8. The diagram of Fig. 11.8(a) is actually
canceled by the effects of the counterterm $\delta J$, as shown in Eq. (11.64). Thus
the leading diagrammatic contribution to the effective potential comes from
the two-loop diagrams of Fig. 11.8(b).
The result (11.74) is manifestly $O(N)$-symmetric. From the question that
we posed at the beginning of Section 11.2, we might have feared that this
property would be destroyed when we compute radiative corrections about a
state with spontaneously broken symmetry. But $V_{\rm eff}(\phi_{\rm cl})$ is the function that
we minimize to find the vacuum state, and so it should properly be
symmetric, even if the lowest-energy vacuum is asymmetric. In the formalism we
have constructed here, there is no need to worry. Formula (11.63) is
manifestly invariant, term by term, under the original $O(N)$ symmetry of the
Lagrangian. Thus we must necessarily have arrived at an $O(N)$-symmetric
result for $V_{\rm eff}(\phi_{\rm cl})$.

Before going on to determine $\delta_\mu$ and $\delta_\lambda$ precisely, we might first check
that the counterterms in Eq. (11.74) are sufficient to make the expression for
$\Gamma[\phi_{\rm cl}]$ finite. The factor $\Gamma(-d/2)$ has poles at $d = 0, 2, 4$. The pole at $d = 0$ is a
constant, independent of $\phi_{\rm cl}$, and therefore without physical significance. The
pole at $d = 2$ is an even quadratic polynomial in $\phi_{\rm cl}$. The pole at $d = 4$ is an
even quartic polynomial in $\phi_{\rm cl}$. Thus Eq. (11.74) becomes a finite expression
in the limit $d \to 2$ if we set

$$ \delta_\mu = -\lambda(N+2)\, \frac{\Gamma(1-\frac{d}{2})}{(4\pi)} + \text{finite}. $$

The expression is finite as $d \to 4$ if we set

$$ \begin{aligned} \delta_\mu &= -\lambda\mu^2(N+2)\, \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^2} + \text{finite}; \\ \delta_\lambda &= \lambda^2(N+8)\, \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^2} + \text{finite}. \end{aligned} \tag{11.76} $$

These expressions agree with our earlier results from Section 11.2, Eqs. (11.33)
and (11.22), in the limits $d \to 2$ and $d \to 4$ respectively.
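A quick numerical sanity check of the $d \to 4$ counterterms in (11.76) is sketched below. Because the pole of $\Gamma(-d/2)$ at $d = 4$ also contains a $\phi_{\rm cl}$-independent constant, the code compares the one-loop term of (11.74) at two field values and tests whether the difference stays bounded as $\epsilon = 4 - d \to 0$ once the pole parts of $\delta_\mu$ and $\delta_\lambda$ are included (with their finite parts set to zero). The function names and the parameter values are assumptions chosen so that both $m_i^2$ are positive.

```python
import math

def one_loop(phi2, d, lam, mu2, N):
    """Second line of (11.74), as a function of phi_cl^2."""
    m2_pi = lam * phi2 - mu2
    m2_sigma = 3 * lam * phi2 - mu2
    pref = -0.5 * math.gamma(-d / 2) / (4 * math.pi) ** (d / 2)
    return pref * ((N - 1) * m2_pi ** (d / 2) + m2_sigma ** (d / 2))

def counterterms(phi2, d, lam, mu2, N):
    """Last line of (11.74), keeping only the pole parts quoted in (11.76)."""
    pole = math.gamma(2 - d / 2) / (4 * math.pi) ** 2
    delta_mu = -lam * mu2 * (N + 2) * pole
    delta_lam = lam ** 2 * (N + 8) * pole
    return 0.5 * delta_mu * phi2 + 0.25 * delta_lam * phi2 ** 2

def potential_difference(eps, with_counterterms,
                         lam=1.0, mu2=1.0, N=4, phi2_a=4.0, phi2_b=2.0):
    """Loop correction at phi2_a minus that at phi2_b, for d = 4 - eps."""
    d = 4 - eps

    def correction(phi2):
        total = one_loop(phi2, d, lam, mu2, N)
        if with_counterterms:
            total += counterterms(phi2, d, lam, mu2, N)
        return total

    return correction(phi2_a) - correction(phi2_b)

for eps in (1e-2, 1e-3, 1e-4):
    print(eps,
          potential_difference(eps, with_counterterms=False),
          potential_difference(eps, with_counterterms=True))
```

The bare column should grow like $1/\epsilon$, while the column with counterterms should settle to a finite value, which is the content of the statement that (11.76) renders $V_{\rm eff}$ finite up to a field-independent constant.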

The finite parts of $\delta_\lambda$ and $\delta_\mu$ depend on the exact form of the
renormalization conditions that are imposed. For example, in Section 11.2, we imposed
the condition (11.16) that the vacuum expectation value of $\phi$ equals $\mu/\sqrt{\lambda}$
and the additional conditions in (11.17) on the scattering amplitude and field
strength of the $\sigma$. Condition (11.16) is readily expressed in terms of the
effective potential as

$$ \frac{\partial V_{\rm eff}}{\partial\phi_{\rm cl}}\bigg|_{\phi_{\rm cl} = \mu/\sqrt{\lambda}} = 0. $$

Using the connection between derivatives of $\Gamma$ and one-particle-irreducible
amplitudes, we could write the other two conditions as Fourier transforms to
momentum space of functional derivatives of $\Gamma[\phi_{\rm cl}]$. In this way, it is
possible in principle to reconstruct the particular renormalization scheme used in
Section 11.2.

However, if we want to visualize the modification of the lowest-order
results that is induced by the quantum corrections, we can apply a
renormalization scheme that can be implemented more easily. One such scheme, known as
minimal subtraction (MS), is simply to remove the $(1/\epsilon)$ poles (for $\epsilon = 4 - d$)
in potentially divergent quantities. Normally, though, these $(1/\epsilon)$ poles are
accompanied by terms involving $\gamma$ and $\log(4\pi)$. It is convenient, and no more
arbitrary, to subtract these terms as well. In this prescription, known as
modified minimal subtraction or $\overline{\rm MS}$ (“em-ess-bar”), one replaces

$$ \frac{\Gamma(2-\frac{d}{2})}{(4\pi)^{d/2}\, (m^2)^{2-d/2}} = \frac{1}{(4\pi)^2} \left( \frac{2}{\epsilon} - \gamma + \log(4\pi) - \log(m^2) \right) \;\longrightarrow\; \frac{1}{(4\pi)^2} \big( -\log(m^2/M^2) \big), \tag{11.77} $$

where $M$ is an arbitrary mass parameter that we have introduced to make the
final equation dimensionally correct. You should think of $M$ as parametrizing
a sequence of possible renormalization conditions. The $\overline{\rm MS}$ renormalization
scheme usually puts one-loop corrections in an especially simple form. The
price of this simplicity is that it normally takes some effort to express
physically measurable quantities in terms of the parameters of the $\overline{\rm MS}$ expression.

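To apply the $\overline{\rm MS}$ renormalization prescription to (11.74), we need to
expand the divergent terms in this equation in powers of $\epsilon$. As an example,
consider the $\overline{\rm MS}$ regularization of expression (11.73):

$$
\begin{aligned}
\frac{\Gamma(-\frac d2)}{(4\pi)^{d/2}}\,(m^2)^{d/2}
 &= \frac{\Gamma(2-\frac d2)}{\frac d2\big(\frac d2 - 1\big)}\,\frac{1}{(4\pi)^{d/2}}\,(m^2)^{d/2} \\
 &= \frac{m^4}{2(4\pi)^2}\Big(\frac{2}{\epsilon} - \gamma + \log(4\pi) - \log(m^2) + \frac32\Big) \\
 &\longrightarrow \frac{m^4}{2(4\pi)^2}\Big(-\log(m^2/M^2) + \frac32\Big).
\end{aligned} \tag{11.78}
$$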
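
The expansion in Eq. (11.78) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the sample value $m^2 = 2$ are our own choices, and the check only confirms that the exact expression and its expansion differ by terms of order $\epsilon$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exact_term(eps, m2):
    """Gamma(-d/2) (m^2)^(d/2) / (4 pi)^(d/2), evaluated at d = 4 - eps."""
    d = 4.0 - eps
    return math.gamma(-d / 2.0) * m2 ** (d / 2.0) / (4.0 * math.pi) ** (d / 2.0)

def expanded_term(eps, m2):
    """The same quantity expanded in eps, dropping terms of order eps."""
    bracket = (2.0 / eps - EULER_GAMMA + math.log(4.0 * math.pi)
               - math.log(m2) + 1.5)
    return m2 ** 2 / (2.0 * (4.0 * math.pi) ** 2) * bracket

def msbar_term(m2, M2):
    """MS-bar replacement: drop 2/eps - gamma + log(4 pi), insert the scale M."""
    return m2 ** 2 / (2.0 * (4.0 * math.pi) ** 2) * (-math.log(m2 / M2) + 1.5)

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        diff = exact_term(eps, 2.0) - expanded_term(eps, 2.0)
        print(f"eps = {eps:g}: exact - expanded = {diff:.3e}")
```

The printed difference should shrink roughly in proportion to $\epsilon$, which is the behavior expected if the constant $3/2$ and the logarithms have been collected correctly.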


Modifying our result (11.74) in this way, we find

$$
\begin{aligned}
V_{\rm eff} = {}& -\frac12 \mu^2 \phi_{\rm cl}^2 + \frac{\lambda}{4}\phi_{\rm cl}^4 \\
 & + \frac14 \frac{1}{(4\pi)^2}\Big( (N-1)(\lambda\phi_{\rm cl}^2 - \mu^2)^2
     \big(\log[(\lambda\phi_{\rm cl}^2 - \mu^2)/M^2] - \tfrac32\big) \\
 & \qquad\qquad + (3\lambda\phi_{\rm cl}^2 - \mu^2)^2
     \big(\log[(3\lambda\phi_{\rm cl}^2 - \mu^2)/M^2] - \tfrac32\big)\Big).
\end{aligned} \tag{11.79}
$$

The effective potential is thus modified to be slightly steeper at large values
of $\phi_{\rm cl}$ and more negative at smaller values, as shown in Fig. 11.9. For each
set of values of $\mu$, $\lambda$, and $M$, we can determine the preferred vacuum state
by minimizing $V_{\rm eff}(\phi_{\rm cl})$ with respect to $\phi_{\rm cl}$. The correction to $V_{\rm eff}$ is
undefined when the arguments of the logarithms become negative, but
fortunately the minima of $V_{\rm eff}$ occur outside of this region, as is illustrated
in the figure.
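
The minimization described above can be sketched numerically for the case $N = 1$ shown in the figure. This is an illustration under stated assumptions: the parameter values $\mu^2 = 1$, $\lambda = 0.5$, $M^2 = 2$ are hypothetical, the function names are ours, and a simple grid scan stands in for a proper minimizer.

```python
import math

LOOP_FACTOR = 1.0 / (4.0 * math.pi) ** 2

def v_eff(phi, mu2, lam, M2, N=1):
    """One-loop MS-bar effective potential, following the form of Eq. (11.79)."""
    tree = -0.5 * mu2 * phi ** 2 + 0.25 * lam * phi ** 4
    # (multiplicity, field-dependent mass squared) for each fluctuation mode
    modes = [(1, 3.0 * lam * phi ** 2 - mu2)]
    if N > 1:
        modes.append((N - 1, lam * phi ** 2 - mu2))
    loop = 0.0
    for multiplicity, m2 in modes:
        if m2 <= 0.0:
            raise ValueError("logarithm undefined: non-positive mass squared")
        loop += multiplicity * m2 ** 2 * (math.log(m2 / M2) - 1.5)
    return tree + 0.25 * LOOP_FACTOR * loop

def find_minimum(mu2, lam, M2, points=20001):
    """Scan phi on a grid inside the region where the N = 1 logarithm is defined."""
    phi_tree = math.sqrt(mu2 / lam)
    lo = math.sqrt(mu2 / (3.0 * lam)) * (1.0 + 1e-6)
    hi = 3.0 * phi_tree
    best_phi, best_v = lo, v_eff(lo, mu2, lam, M2)
    for k in range(1, points):
        phi = lo + (hi - lo) * k / (points - 1)
        v = v_eff(phi, mu2, lam, M2)
        if v < best_v:
            best_phi, best_v = phi, v
    return best_phi, best_v

if __name__ == "__main__":
    mu2, lam, M2 = 1.0, 0.5, 2.0  # hypothetical sample values
    phi_min, v_min = find_minimum(mu2, lam, M2)
    print("tree-level minimum:", math.sqrt(mu2 / lam))
    print("one-loop minimum:  ", phi_min, "with V_eff =", v_min)
```

For these sample values the minimum sits close to the classical value $\mu/\sqrt{\lambda}$, well inside the region where the logarithm is defined.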

Before going on, we would like to raise two questions about this expression
for the effective potential. The problems that we will raise occur generically in
quantum field theory calculations, but expression (11.79) provides a concrete
illustration of these difficulties. Most of our discussion in the next two chapters
will be devoted to building a formalism within which these questions can be
answered.

Figure 11.9. The effective potential for $\phi^4$ theory ($N = 1$), with quantum
corrections included as in Eq. (11.79). The lighter-weight curve shows the
classical potential energy, for comparison.

First, it is troubling that, while our classical Lagrangian contained only
two parameters, $\mu$ and $\lambda$, the result (11.79) depends on three parameters, of
which one is the arbitrary mass scale $M$. A superficial reply to this complaint
can be given as follows: Consider the change in $V_{\rm eff}(\phi_{\rm cl})$ that results from
changing the value of $M^2$ to $M^2 + \delta M^2$. From the explicit form of (11.79),
we can see that this change is compensated completely by shifting the values
of $\mu$ and $\lambda$, according to

$$
\begin{aligned}
\lambda &\to \lambda + \frac{\lambda^2}{(4\pi)^2}(N+8)\cdot\frac{\delta M^2}{M^2}, \\
\mu^2 &\to \mu^2 + \frac{\lambda\mu^2}{(4\pi)^2}(N+2)\cdot\frac{\delta M^2}{M^2}.
\end{aligned} \tag{11.80}
$$

Thus, a change in $M^2$ is completely equivalent to changes in the parameters
$\mu$ and $\lambda$. It is not clear, however, why this should be true or how this fact
helps us understand the dependence of our formulae on $M^2$.
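
The compensation claimed in Eq. (11.80) can be tested numerically. The sketch below is our own check, with hypothetical sample values: it changes $M^2$ by a small fraction `eta`, applies the shifts of $\mu^2$ and $\lambda$ to the classical terms only (that is, to leading order in $\lambda$), and verifies that what remains does not depend on $\phi_{\rm cl}$.

```python
import math

LOOP = 1.0 / (4.0 * math.pi) ** 2

def v_eff(phi, mu2, lam, M2, N):
    """Effective potential in the form of Eq. (11.79); needs lam*phi^2 > mu2."""
    tree = -0.5 * mu2 * phi ** 2 + 0.25 * lam * phi ** 4
    m2_pi = lam * phi ** 2 - mu2
    m2_sigma = 3.0 * lam * phi ** 2 - mu2
    loop = ((N - 1) * m2_pi ** 2 * (math.log(m2_pi / M2) - 1.5)
            + m2_sigma ** 2 * (math.log(m2_sigma / M2) - 1.5))
    return tree + 0.25 * LOOP * loop

def residual(phi, mu2, lam, M2, N, eta):
    """Change of V_eff under M^2 -> M^2 (1 + eta), plus the change of the
    classical terms under the compensating shifts of mu^2 and lambda."""
    dv_scale = v_eff(phi, mu2, lam, M2 * (1.0 + eta), N) - v_eff(phi, mu2, lam, M2, N)
    d_mu2 = (N + 2) * lam * mu2 * LOOP * eta
    d_lam = (N + 8) * lam ** 2 * LOOP * eta
    dv_tree = -0.5 * d_mu2 * phi ** 2 + 0.25 * d_lam * phi ** 4
    return dv_scale + dv_tree

if __name__ == "__main__":
    mu2, lam, M2, N, eta = 1.0, 0.5, 1.0, 4, 1e-6  # hypothetical sample values
    for phi in (2.0, 2.5, 3.0):
        print(phi, residual(phi, mu2, lam, M2, N, eta))
```

The residual is the same at every field value, up to terms of order `eta` squared. It equals a field-independent constant proportional to $N\mu^4$, which does not affect the physics.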

The second problem arises from the fact that the one-loop correction in
Eq. (11.79) includes a logarithm that can become large enough to compensate
the small coupling constant $\lambda$. The problem is particularly clear in the limit
$\mu^2 \to 0$; then Eq. (11.79) takes the form

$$
\begin{aligned}
V_{\rm eff} &= \frac{\lambda}{4}\phi_{\rm cl}^4 + \frac14 \frac{\lambda^2\phi_{\rm cl}^4}{(4\pi)^2}
   \Big((N+8)\big(\log(\lambda\phi_{\rm cl}^2/M^2) - \tfrac32\big) + 9\log 3\Big) \\
 &= \frac14 \phi_{\rm cl}^4 \Big[\lambda + \frac{\lambda^2}{(4\pi)^2}
   \Big((N+8)\big(\log(\lambda\phi_{\rm cl}^2/M^2) - \tfrac32\big) + 9\log 3\Big)\Big].
\end{aligned} \tag{11.81}
$$

Where is the minimum of this potential? If we take this expression at face
value, we find that $V_{\rm eff}(\phi_{\rm cl})$ passes through zero when $\phi_{\rm cl}$ reaches a very
small value of order

$$
\phi_{\rm cl}^2 \sim \frac{M^2}{\lambda}\cdot\exp\Big[-\frac{(4\pi)^2}{(N+8)\lambda}\Big],
$$

and, near this point, attains a minimum with a nonzero value of $\phi_{\rm cl}$. But
the zero occurs by the cancellation of the leading term against the quantum
correction. In other words, perturbation theory breaks down completely before
we can address the question of whether $V_{\rm eff}(\phi_{\rm cl})$, for $\mu^2 = 0$, has a
symmetry-breaking minimum. It seems that our present tools are quite
inadequate to resolve this case.
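
To see how severe the cancellation is, one can locate the zero of the bracket in Eq. (11.81) explicitly. The following sketch uses hypothetical values ($\lambda = 0.5$, $M^2 = 1$, $N = 1$) and our own function names. It solves for the zero in closed form and confirms that the one-loop term there is exactly as large as the tree-level term.

```python
import math

LOOP = 1.0 / (4.0 * math.pi) ** 2

def one_loop_part(phi2, lam, M2, N):
    """One-loop piece of the square bracket in Eq. (11.81)."""
    return lam ** 2 * LOOP * ((N + 8) * (math.log(lam * phi2 / M2) - 1.5)
                              + 9.0 * math.log(3.0))

def bracket(phi2, lam, M2, N):
    """Full square bracket of Eq. (11.81): tree-level lambda plus one loop."""
    return lam + one_loop_part(phi2, lam, M2, N)

def zero_crossing(lam, M2, N):
    """Value of phi_cl^2 at which the bracket vanishes, solved in closed form."""
    log_arg = -1.0 / ((N + 8) * lam * LOOP) + 1.5 - 9.0 * math.log(3.0) / (N + 8)
    return (M2 / lam) * math.exp(log_arg)

def leading_estimate(lam, M2, N):
    """Order-of-magnitude estimate quoted in the text."""
    return (M2 / lam) * math.exp(-1.0 / ((N + 8) * lam * LOOP))

def one_loop_over_tree(phi2, lam, M2, N):
    """Ratio of the one-loop correction to the tree-level coupling."""
    return one_loop_part(phi2, lam, M2, N) / lam

if __name__ == "__main__":
    lam, M2, N = 0.5, 1.0, 1  # hypothetical sample values
    phi2 = zero_crossing(lam, M2, N)
    print("phi_cl^2 at the zero:", phi2)
    print("estimate from text:  ", leading_estimate(lam, M2, N))
    print("one-loop / tree:     ", one_loop_over_tree(phi2, lam, M2, N))
```

A ratio of magnitude one between successive orders is the signal that the expansion parameter is no longer $\lambda$ but $\lambda$ times a large logarithm.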

Although it is far from obvious, these two problems turn out to be related 
to each other. One of our major results in Chapter 12 will be an explanation of 
the interrelation of $M^2$, $\lambda$, and $\mu^2$ displayed in Eq. (11.80). Then, in Chapter
13, we will use the insight we have gained from this analysis to solve completely 
the second problem of the appearance of large logarithms. Before beginning 
that study, however, there are a few issues we have yet to discuss in the more 
formal aspects of the renormalization of theories with spontaneously broken 
symmetry. 


11.5 The Effective Action as a Generating Functional 

Now that we have defined the effective action and computed it for one
particular theory, let us return to our goal of understanding the
renormalization of theories with hidden symmetry. In Section 11.6 we will use
the effective action as a tool in achieving this goal. First, however, we must
investigate in more detail the relation between the effective action and
Feynman diagrams.

We saw in Section 9.2 that the functional derivatives of $Z[J]$ with respect
to $J(x)$ produce the correlation functions of the scalar field (see, for example,
Eq. (9.35)). In other words, $Z[J]$ is the generating functional of correlation
functions. Our goal now is to show that $\Gamma[\phi_{\rm cl}]$ is also such a generating
functional; specifically, it is the generating functional of
one-particle-irreducible (1PI) correlation functions. Since the 1PI correlation
functions figure prominently in the theory of renormalization, this result will
be central in the discussion of renormalization in the following section.

To begin, let us consider the functional derivatives not of $\Gamma[\phi_{\rm cl}]$, but
of $E[J] = i \log Z[J]$. The first derivative, given in Eq. (11.44), is precisely
$-\langle\phi(x)\rangle$. The second derivative is

$$
\begin{aligned}
\frac{\delta^2 E[J]}{\delta J(x)\,\delta J(y)}
 &= -\frac{i}{Z}\int\!\mathcal{D}\phi\; e^{i\int(\mathcal{L}+J\phi)}\,\phi(x)\phi(y)
   + \frac{i}{Z^2}\int\!\mathcal{D}\phi\; e^{i\int(\mathcal{L}+J\phi)}\,\phi(x)
     \int\!\mathcal{D}\phi\; e^{i\int(\mathcal{L}+J\phi)}\,\phi(y) \\
 &= -i\big[\langle\phi(x)\phi(y)\rangle - \langle\phi(x)\rangle\langle\phi(y)\rangle\big].
\end{aligned} \tag{11.82}
$$

If we were to compute the term $\langle\phi(x)\phi(y)\rangle$ from Feynman diagrams, there
would be two types of contributions:

[Diagrams: a connected blob joining $x$ and $y$, plus a disconnected product of
one blob attached to $x$ and one blob attached to $y$.] (11.83)

where each circle corresponds to a sum of connected diagrams. The second
term in the last line of Eq. (11.82) cancels the second, disconnected, term of
(11.83). Thus the second derivative of $E[J]$ contains only those contributions
to $\langle\phi(x)\phi(y)\rangle$ that come from connected Feynman diagrams. Let us call this
object the connected correlator:

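$$
\frac{\delta^2 E[J]}{\delta J(x)\,\delta J(y)} = -i\,\langle\phi(x)\phi(y)\rangle_{\rm conn}. \tag{11.84}
$$

Similarly, the third functional derivative of $E[J]$ is

$$
\begin{aligned}
\frac{\delta^3 E[J]}{\delta J(x)\,\delta J(y)\,\delta J(z)}
 &= \big[\langle\phi(x)\phi(y)\phi(z)\rangle
   - \langle\phi(x)\phi(y)\rangle\langle\phi(z)\rangle
   - \langle\phi(x)\phi(z)\rangle\langle\phi(y)\rangle \\
 &\qquad - \langle\phi(y)\phi(z)\rangle\langle\phi(x)\rangle
   + 2\,\langle\phi(x)\rangle\langle\phi(y)\rangle\langle\phi(z)\rangle\big] \\
 &= \langle\phi(x)\phi(y)\phi(z)\rangle_{\rm conn}.
\end{aligned} \tag{11.85}
$$

In each successive derivative of $E[J]$ all contributions cancel except for those
from fully connected diagrams. The general formula for $n$ derivatives is

$$
\frac{\delta^n E[J]}{\delta J(x_1)\cdots\delta J(x_n)}
 = (i)^{n+1}\,\langle\phi(x_1)\cdots\phi(x_n)\rangle_{\rm conn}. \tag{11.86}
$$

We therefore refer to $E[J]$ as the generating functional of connected correlation
functions.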
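
The cancellation of disconnected pieces can be illustrated in a zero-dimensional toy model, where the functional integral becomes an ordinary integral. This is a sketch under loud assumptions. We use a Euclidean weight $\exp(-\phi^2/2 - g\phi^4/24 + J\phi)$ and the generating function $W(J) = \log Z(J)$, so the factors of $i$ in Eqs. (11.82) to (11.86) are absent. The coupling `g`, the source value, and all function names are hypothetical.

```python
import math

def moments(J, g, n_max=3, half_width=10.0, steps=4000):
    """Return Z(J) and the moments <phi^n>, n = 0..n_max, by the trapezoid rule."""
    h = 2.0 * half_width / steps
    sums = [0.0] * (n_max + 1)
    for k in range(steps + 1):
        phi = -half_width + k * h
        w = math.exp(-0.5 * phi ** 2 - g * phi ** 4 / 24.0 + J * phi)
        if k == 0 or k == steps:
            w *= 0.5  # endpoint weights of the trapezoid rule
        for n in range(n_max + 1):
            sums[n] += w * phi ** n
    Z = sums[0] * h
    return Z, [s * h / Z for s in sums]

def W(J, g):
    """Zero-dimensional analog of the connected generating functional."""
    return math.log(moments(J, g, n_max=0)[0])

def connected_from_moments(J, g):
    """Second and third cumulants built from full correlators."""
    _, m = moments(J, g)
    k2 = m[2] - m[1] ** 2
    k3 = m[3] - 3.0 * m[2] * m[1] + 2.0 * m[1] ** 3
    return k2, k3

def connected_from_derivatives(J, g, h=1e-2):
    """Second and third derivatives of W by central finite differences."""
    w2 = (W(J + h, g) - 2.0 * W(J, g) + W(J - h, g)) / h ** 2
    w3 = (W(J + 2 * h, g) - 2.0 * W(J + h, g)
          + 2.0 * W(J - h, g) - W(J - 2 * h, g)) / (2.0 * h ** 3)
    return w2, w3

if __name__ == "__main__":
    print(connected_from_moments(0.4, 1.0))
    print(connected_from_derivatives(0.4, 1.0))
```

The two printed pairs agree to the accuracy of the finite differences: differentiating the logarithm of $Z$ automatically subtracts the products of lower-order correlators, just as in Eq. (11.85).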

So much for $E[J]$. Now what about the functional derivatives of the
effective action? Consider first the derivative of Eq. (11.48) with respect to
$J(y)$:

$$
\frac{\delta}{\delta J(y)}\,\frac{\delta\Gamma}{\delta\phi_{\rm cl}(x)} = -\delta(x - y).
$$

We can rewrite the left-hand side of this equation using the chain rule, to
obtain

$$
\begin{aligned}
\delta(x - y) &= -\int d^4z\; \frac{\delta\phi_{\rm cl}(z)}{\delta J(y)}\,
     \frac{\delta^2\Gamma}{\delta\phi_{\rm cl}(z)\,\delta\phi_{\rm cl}(x)} \\
 &= \int d^4z\; \frac{\delta^2 E}{\delta J(y)\,\delta J(z)}\,
     \frac{\delta^2\Gamma}{\delta\phi_{\rm cl}(z)\,\delta\phi_{\rm cl}(x)} \\
 &= \Big(\frac{\delta^2 E}{\delta J\,\delta J}\Big)_{yz}
    \Big(\frac{\delta^2\Gamma}{\delta\phi_{\rm cl}\,\delta\phi_{\rm cl}}\Big)_{zx}.
\end{aligned} \tag{11.87}
$$

In the second line we have used Eq. (11.44). The last line is an abstract
representation of the second line, where we think of each of the second
derivatives as an infinite-dimensional matrix, with the integral over $z$
represented by matrix multiplication. What we have shown is that these two
matrices are inverses of each other:

$$
\Big(\frac{\delta^2 E}{\delta J\,\delta J}\Big)
 = \Big(\frac{\delta^2\Gamma}{\delta\phi_{\rm cl}\,\delta\phi_{\rm cl}}\Big)^{-1}. \tag{11.88}
$$


Now according to Eq. (11.84), the first of these matrices is $-i$ times the
connected two-point function, that is, the exact propagator of the field $\phi$. Let
us call this propagator $D(x,y)$:

$$
\Big(\frac{\delta^2 E}{\delta J\,\delta J}\Big)_{xy}
 = -i\,\langle\phi(x)\phi(y)\rangle_{\rm conn} \equiv -i D(x,y). \tag{11.89}
$$

We will therefore refer to the other matrix (times $-i$) as the inverse
propagator:

$$
\Big(\frac{\delta^2\Gamma}{\delta\phi_{\rm cl}(x)\,\delta\phi_{\rm cl}(y)}\Big) = i D^{-1}(x,y). \tag{11.90}
$$

This provides an interpretation, of sorts, for the second functional derivative
of the effective action. This interpretation becomes more concrete if we go
to momentum space. On a translation-invariant vacuum state (one with $\phi_{\rm cl}$
constant), the matrix $D(x,y)$ must be diagonal in momentum:

$$
D(x,y) = \int\!\frac{d^4p}{(2\pi)^4}\; e^{-ip\cdot(x-y)}\,\widetilde{D}(p). \tag{11.91}
$$

We showed in Eq. (7.43) that the momentum-space propagator $\widetilde{D}(p)$ is a
geometric series in one-particle-irreducible Feynman diagrams. The Fourier
transform of $D^{-1}(x,y)$ then gives the inverse propagator:

$$
\widetilde{D}^{-1}(p) = -i\big(p^2 - m^2 - M^2(p^2)\big), \tag{11.92}
$$

where $M^2(p^2)$ is the sum of one-particle-irreducible two-point diagrams.

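To evaluate higher derivatives of the effective action we again use the
chain rule,

$$
\frac{\delta}{\delta J(z)}
 = \int d^4w\; \frac{\delta\phi_{\rm cl}(w)}{\delta J(z)}\,\frac{\delta}{\delta\phi_{\rm cl}(w)}
 = i\int d^4w\; D(z,w)\,\frac{\delta}{\delta\phi_{\rm cl}(w)}, \tag{11.93}
$$

together with the standard rule for differentiating matrix inverses:

$$
\frac{\partial}{\partial\alpha} M^{-1}(\alpha)
 = -M^{-1}\,\frac{\partial M}{\partial\alpha}\,M^{-1}. \tag{11.94}
$$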
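
The rule (11.94) for differentiating a matrix inverse is easy to confirm on a small example. The sketch below uses a hypothetical $2\times2$ matrix function `M(alpha)` of our own choosing and compares the rule against a central finite difference. It illustrates the identity only, not the infinite-dimensional case used in the text.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def M(alpha):
    """Hypothetical matrix-valued function used only for this check."""
    return [[2.0 + alpha, alpha ** 2], [math.sin(alpha), 3.0]]

def dM(alpha):
    """Entry-by-entry derivative of M with respect to alpha."""
    return [[1.0, 2.0 * alpha], [math.cos(alpha), 0.0]]

def d_inverse_rule(alpha):
    """Right-hand side of the rule: -M^{-1} (dM/dalpha) M^{-1}."""
    m_inv = inverse(M(alpha))
    prod = matmul(matmul(m_inv, dM(alpha)), m_inv)
    return [[-x for x in row] for row in prod]

def d_inverse_numeric(alpha, h=1e-5):
    """Central finite difference of M^{-1}(alpha)."""
    up, down = inverse(M(alpha + h)), inverse(M(alpha - h))
    return [[(up[i][j] - down[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]

if __name__ == "__main__":
    print(d_inverse_rule(0.3))
    print(d_inverse_numeric(0.3))
```

The ordering of the factors matters, since $M^{-1}$ and $\partial M/\partial\alpha$ do not commute in general. The same ordering carries over to the propagators in the functional version.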

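Applying these identities to Eq. (11.88), we find (with some abbreviated
notation)

$$
\begin{aligned}
\frac{\delta^3 E[J]}{\delta J_x\,\delta J_y\,\delta J_z}
 &= i\int d^4w\; D_{zw}\,\frac{\delta}{\delta\phi^{\rm cl}_w}
    \Big(\frac{\delta^2\Gamma}{\delta\phi^{\rm cl}_x\,\delta\phi^{\rm cl}_y}\Big)^{-1} \\
 &= i\int d^4w\; D_{zw}\,(-1)\int d^4u \int d^4v\; (-iD_{xu})\,
    \frac{\delta^3\Gamma}{\delta\phi^{\rm cl}_u\,\delta\phi^{\rm cl}_v\,\delta\phi^{\rm cl}_w}\,(-iD_{vy}) \\
 &= i\int d^4u\, d^4v\, d^4w\; D_{xu} D_{yv} D_{zw}\,
    \frac{\delta^3\Gamma}{\delta\phi^{\rm cl}_u\,\delta\phi^{\rm cl}_v\,\delta\phi^{\rm cl}_w}.
\end{aligned} \tag{11.95}
$$

This relation is more clearly expressed diagrammatically. The left-hand side
is the connected three-point function. If we extract exact propagators as
indicated in (11.95), this decomposes as follows:

[Diagram: the connected three-point function (dark gray circle) equals three
full propagators attached to a light gray circle.]

In this picture, each dark gray circle represents the sum of connected diagrams,
while the light gray circle on the right-hand side represents the third derivative
of $i\Gamma[\phi_{\rm cl}]$. We see that the third derivative of $i\Gamma[\phi_{\rm cl}]$ is just the connected
correlation function with all three full propagators removed, that is, the
one-particle-irreducible three-point function:

$$
\frac{i\,\delta^3\Gamma}{\delta\phi_{\rm cl}(x)\,\delta\phi_{\rm cl}(y)\,\delta\phi_{\rm cl}(z)}
 = \langle\phi(x)\phi(y)\phi(z)\rangle_{\rm 1PI}.
$$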

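By similar, if increasingly complicated, manipulations, one can derive the
same relation for each successive derivative of $\Gamma$. For example, differentiating
Eq. (11.95), we eventually find (using matrix notation with repeated indices
implicitly integrated over)

$$
\frac{-i\,\delta^4 E}{\delta J_w\,\delta J_x\,\delta J_y\,\delta J_z}
 = D_{ws} D_{xt} D_{yu} D_{zv}\Big[
   \frac{i\,\delta^4\Gamma}{\delta\phi^{\rm cl}_s\,\delta\phi^{\rm cl}_t\,\delta\phi^{\rm cl}_u\,\delta\phi^{\rm cl}_v}
   + \frac{i\,\delta^3\Gamma}{\delta\phi^{\rm cl}_s\,\delta\phi^{\rm cl}_t\,\delta\phi^{\rm cl}_q}\,D_{qr}\,
     \frac{i\,\delta^3\Gamma}{\delta\phi^{\rm cl}_r\,\delta\phi^{\rm cl}_u\,\delta\phi^{\rm cl}_v}
   + (t \leftrightarrow u) + (t \leftrightarrow v)\Big].
$$

Since the left-hand side of this equation is the connected four-point function,
we can rewrite it diagrammatically as

[Diagram: the connected four-point function equals four full propagators
attached to a single light gray circle, plus three terms in which two light gray
three-point circles are joined by a full propagator.]

As above, the dark gray circles represent the sum of connected diagrams, while
the light gray circles represent $i$ times various derivatives of $\Gamma$. Subtracting
the last three terms from each side removes all one-particle-reducible pieces
from the connected four-point function and so identifies the fourth derivative
of $i\Gamma$ as the one-particle-irreducible four-point function. The general relation
(for $n \geq 3$) is

$$
\frac{\delta^n\Gamma[\phi_{\rm cl}]}{\delta\phi_{\rm cl}(x_1)\cdots\delta\phi_{\rm cl}(x_n)}
 = -i\,\langle\phi(x_1)\cdots\phi(x_n)\rangle_{\rm 1PI}. \tag{11.96}
$$

In other words, the effective action is the generating functional of
one-particle-irreducible correlation functions.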

This conclusion implies that $\Gamma$ contains the complete set of physical
predictions of the quantum field theory. Let us review how this information is
encoded. The vacuum state of the field theory is identified as the minimum
of the effective potential. The location of the minimum determines whether
the symmetries of the Lagrangian are preserved or spontaneously broken. The
second derivative of $\Gamma$ is the inverse propagator. The poles of the propagator,
or the zeros of the inverse propagator, give the values of the particle masses.
Thus the particle masses $m^2$ are determined as the values of $p^2$ that solve the
equation

$$
\widetilde{D}^{-1}(p^2) = -i\int d^4x\; e^{ip\cdot(x-y)}\,
  \frac{\delta^2\Gamma}{\delta\phi_{\rm cl}\,\delta\phi_{\rm cl}}(x,y) = 0. \tag{11.97}
$$

The higher derivatives of $\Gamma$ are the one-particle-irreducible amplitudes. These
can be connected by full propagators and joined together to construct four-
and higher-point connected amplitudes, which give the $S$-matrix elements.
Thus, from the knowledge of $\Gamma$, we can reconstruct the qualitative behavior
of the quantum field theory, its pattern of symmetry-breaking, and then the
quantitative details of its particles and their interactions.


11.6 Renormalization and Symmetry: General Analysis 

In our analysis of the divergences of quantum field theories (especially in the
paragraph below Eq. (10.4)), we noted that the basic divergences of Feynman
integrals are associated with one-particle-irreducible diagrams. Thus we
might expect that the effective action will be a useful object in discussing
the renormalizability of quantum field theories, especially those with
spontaneously broken symmetry. In this section we will make use of the
effective action in precisely this way.

In Section 11.4, we saw in a particular example that the formalism for 
calculating the effective action provides the counterterms needed to remove 
the ultraviolet divergences, at least at the one-loop level. These counterterms 
were exactly those of the original Lagrangian. We will now argue that this
set of counterterms is always sufficient, to all orders and for any
renormalizable field theory, by applying the power-counting arguments of
Section 10.1 directly to the computation of the effective action. We will use
the language of scalar field theories, but the arguments can be generalized to
theories of spinor and vector fields.

Consider first the computation of the effective potential for constant
($x$-independent) classical fields, in a field theory with an arbitrary number of
fields $\phi^i$. The effective potential has mass dimension 4, so we expect that
$V_{\rm eff}(\phi_{\rm cl})$ will have divergent terms up to $\Lambda^4$. To understand these
divergences, expand $V_{\rm eff}(\phi_{\rm cl})$ in a Taylor series:

$$
V_{\rm eff}(\phi_{\rm cl}) = A_0 + A_2^{ij}\,\phi^i_{\rm cl}\phi^j_{\rm cl}
 + A_4^{ijkl}\,\phi^i_{\rm cl}\phi^j_{\rm cl}\phi^k_{\rm cl}\phi^l_{\rm cl} + \cdots.
$$

In theories without a symmetry $\phi^i \to -\phi^i$, there might also be terms
linear and cubic in $\phi^i$; we omit these for simplicity. The coefficients $A_0$, $A_2$,
$A_4$ have mass dimension, respectively, 4, 2, and 0; thus we expect them to
contain $\Lambda^4$, $\Lambda^2$, and $\log\Lambda$ divergences, respectively. The power-counting
analysis predicts that all higher terms in the Taylor series expansion should
be finite. The constant term $A_0$ is independent of $\phi_{\rm cl}$; it has no physical
significance. However, the divergences in $A_2$ and $A_4$ appear in physical
quantities, since these coefficients enter the inverse propagator (11.90) and
the irreducible four-point function (11.96) and therefore appear in the
computation of $S$-matrix elements. There is one further coefficient in the
effective action that has nonnegative mass dimension by power counting; this
is the coefficient of the term quadratic in $\partial_\mu\phi^i_{\rm cl}$ which appears when the
effective action is evaluated for a nonconstant background field:

$$
\Delta\Gamma[\phi_{\rm cl}] = \int d^4x\; B_2^{ij}\,\partial_\mu\phi^i_{\rm cl}\,\partial^\mu\phi^j_{\rm cl}. \tag{11.98}
$$

All other coefficients in the Taylor expansion of the effective action in powers
of $\phi_{\rm cl}$ are finite by power counting.

We can now argue that the counterterms of the original Lagrangian suffice
to remove the divergences that might appear in the computation of $\Gamma[\phi_{\rm cl}]$.
The argument proceeds in two steps. We first use the BPHZ theorem to argue
that the divergences of Green's functions can be removed by adjusting a set
of counterterms corresponding to the possible operators that can be added
to the Lagrangian with coefficients of mass dimension greater than or equal
to zero. The coefficients of these counterterms are in 1-to-1 correspondence
with the coefficients $A_2$, $A_4$, and $B_2$ of the effective action. Next, we use the
fact that the effective action is manifestly invariant to the original symmetry
group of the model. This is true even if the vacuum state of the model has
spontaneous symmetry breaking. This symmetry of the effective action follows
from the analysis of Section 11.4, since the method we presented there for
computing the effective action is manifestly invariant to the original symmetry
of the Lagrangian. Combining these two results, we conclude that the effective
action can always be made finite by adjusting the set of counterterms that
are invariant to the original symmetry of the theory, even if this symmetry is
spontaneously broken. By using the results of Section 11.5, which explain how
to construct the Green's functions of the theory from the functional derivatives
of the effective action, this conclusion of renormalizability extends to all the
Green's functions of the theory.

To make this abstract argument more concrete, we will demonstrate in a
simple example how the functional derivatives of the effective action yield a
set of Feynman diagrams whose divergences correspond to symmetric
counterterms. Let us, then, return once again to the $O(N)$-invariant linear
sigma model and compute the second functional derivative of $\Gamma[\phi_{\rm cl}]$. If the
whole formalism we have constructed hangs together, we should be able to
recognize the result as the Feynman diagram expansion of the inverse
propagator, with divergences corresponding to the counterterms of
$O(N)$-symmetric scalar field theory.

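To begin, we write out expression (11.63) explicitly for this model:

$$
\Gamma[\phi_{\rm cl}] = \int d^4x\,\Big(\frac12(\partial_\mu\phi^i_{\rm cl})^2 + \frac12\mu^2(\phi^i_{\rm cl})^2
  - \frac{\lambda}{4}\big((\phi^i_{\rm cl})^2\big)^2\Big)
  + \frac{i}{2}\log\det[-i\mathcal{D}] + \cdots, \tag{11.99}
$$

where

$$
-i\mathcal{D}^{ij} = -\frac{\delta^2\mathcal{L}}{\delta\phi^i\,\delta\phi^j}
 = \partial^2\delta^{ij} + \big(\lambda(\phi^k_{\rm cl}(x))^2 - \mu^2\big)\delta^{ij}
   + 2\lambda\,\phi^i_{\rm cl}(x)\,\phi^j_{\rm cl}(x). \tag{11.100}
$$

For constant $\phi^i_{\rm cl}$, $\mathcal{D}^{ij}$ is the operator that, acting on a given component of
the scalar field, equals the Klein-Gordon operator with mass squared given by
Eq. (11.69). This is the leading-order approximation to the inverse propagator
of the linear sigma model.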

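To find the higher-order corrections to the inverse propagator, we must
compute the second functional derivative of the quantum correction terms in
$\Gamma[\phi_{\rm cl}]$. From (11.99), we find

$$
\frac{\delta^2\Gamma}{\delta\phi^i_{\rm cl}(x)\,\delta\phi^j_{\rm cl}(y)}
 = \frac{\delta^2\mathcal{L}}{\delta\phi^i_{\rm cl}(x)\,\delta\phi^j_{\rm cl}(y)}
 + \frac{i}{2}\,\frac{\delta^2}{\delta\phi^i_{\rm cl}(x)\,\delta\phi^j_{\rm cl}(y)}\log\det[-i\mathcal{D}] + \cdots.
$$

The first term is just the Klein-Gordon operator $i\mathcal{D}^{ij}\delta(x - y)$. To compute
the second term, use identity (9.77) for determinants of matrices:

$$
\frac{\partial}{\partial\alpha}\log\det M(\alpha)
 = \frac{\partial}{\partial\alpha}\,{\rm tr}\log M(\alpha)
 = {\rm tr}\, M^{-1}\frac{\partial M}{\partial\alpha}. \tag{11.101}
$$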
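
Identity (11.101) can likewise be checked on a finite matrix. The sketch below is a minimal illustration with a hypothetical symmetric $2\times2$ matrix `M(alpha)` chosen so that its determinant stays positive. The names and the test point are our own.

```python
import math

def inverse(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def M(alpha):
    """Hypothetical matrix-valued function with positive determinant."""
    return [[1.0 + alpha ** 2, alpha], [alpha, 2.0 + math.cos(alpha)]]

def dM(alpha):
    """Entry-by-entry derivative of M with respect to alpha."""
    return [[2.0 * alpha, 1.0], [1.0, -math.sin(alpha)]]

def log_det(alpha):
    m = M(alpha)
    return math.log(m[0][0] * m[1][1] - m[0][1] * m[1][0])

def trace_rule(alpha):
    """tr( M^{-1} dM/dalpha ), the right-hand side of the identity."""
    m_inv, dm = inverse(M(alpha)), dM(alpha)
    return sum(m_inv[i][k] * dm[k][i] for i in range(2) for k in range(2))

def log_det_derivative_numeric(alpha, h=1e-5):
    """Central finite difference of log det M(alpha)."""
    return (log_det(alpha + h) - log_det(alpha - h)) / (2.0 * h)

if __name__ == "__main__":
    print(trace_rule(0.7), log_det_derivative_numeric(0.7))
```

In the functional setting the trace also includes an integral over spacetime points, which is why a derivative with respect to $\phi_{\rm cl}(z)$ produces a propagator evaluated at coincident points.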


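Using this identity, we find

$$
\begin{aligned}
\frac{i}{2}\,\frac{\delta}{\delta\phi^k_{\rm cl}(z)}\log\det[-i\mathcal{D}]
 &= \frac{i}{2}\,{\rm Tr}\Big[\frac{\delta(-i\mathcal{D})}{\delta\phi^k_{\rm cl}(z)}\,(-i\mathcal{D})^{-1}\Big] \\
 &= -\lambda\big(\phi^k_{\rm cl}(z)\,\delta^{ij} + \phi^i_{\rm cl}(z)\,\delta^{jk} + \phi^j_{\rm cl}(z)\,\delta^{ik}\big)
    (\mathcal{D}^{-1})^{ij}(z,z).
\end{aligned} \tag{11.102}
$$

The quantity $(\mathcal{D}^{-1})^{ij}(x,y)$ is the Klein-Gordon propagator. To differentiate
a second time, we can use the identity (11.94); this yields

$$
\begin{aligned}
\frac{i}{2}\,\frac{\delta^2}{\delta\phi^k_{\rm cl}(z)\,\delta\phi^\ell_{\rm cl}(w)}\log\det[-i\mathcal{D}]
 = {}& -\lambda\big(\delta^{k\ell}\delta^{ij} + \delta^{ik}\delta^{j\ell} + \delta^{i\ell}\delta^{jk}\big)
    (\mathcal{D}^{-1})^{ij}(z,z)\,\delta(z - w) \\
 & + 2i\lambda^2\big(\phi^k_{\rm cl}(z)\,\delta^{ij} + \phi^i_{\rm cl}(z)\,\delta^{jk} + \phi^j_{\rm cl}(z)\,\delta^{ik}\big)
    (\mathcal{D}^{-1})^{im}(z,w) \\
 & \quad\cdot\big(\phi^\ell_{\rm cl}(w)\,\delta^{mn} + \phi^m_{\rm cl}(w)\,\delta^{n\ell} + \phi^n_{\rm cl}(w)\,\delta^{m\ell}\big)
    (\mathcal{D}^{-1})^{nj}(w,z).
\end{aligned} \tag{11.103}
$$

This is expected to be the formal correction to the inverse propagator at
one-loop order, and indeed we can recognize in (11.103) the values of the
one-loop diagrams

[Diagrams: a loop attached to a single four-point vertex, and a loop formed by
two propagators joining two three-point vertices.]

Notice how, in this derivation, every functional derivative on $\mathcal{D}^{-1}$ adds
another propagator to the diagram and thus lowers the degree of divergence, in
conformity with our general arguments in Section 10.1.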

This example illustrates that the successive functional derivatives of $\Gamma[\phi_{\rm cl}]$
are computed by a Feynman diagram expansion, with propagators and vertices
that depend on the classical field. When the classical field is a constant, the
propagators reduce to ordinary Klein-Gordon propagators and so the BPHZ
theorem applies. All ultraviolet divergences can be removed from all of the
amplitudes obtained by differentiating $\Gamma[\phi_{\rm cl}]$ by the use of the most general
set of mass, vertex, and field-strength renormalizations. At the same time, the
perturbation theory is manifestly invariant to the symmetry of the original
Lagrangian, and so the only divergences that appear, and thus the only
counterterms required, are those that respect this symmetry. In general, then,
all amplitudes of a renormalizable theory of scalar fields invariant under a
symmetry group can be made finite using only the set of counterterms invariant
to the symmetry. This gives a complete and quite satisfactory answer to the
question posed at the beginning of Section 11.2.

The computation of the effective action in spatially varying background
fields has not been analyzed at the level of rigor involved in the proof of
the BPHZ theorem. However, it is expected that in this situation also, the
standard set of counterterms for the symmetric theory should suffice. We
can argue this intuitively by using the fact that the ultraviolet divergences of
Feynman diagrams are local in spacetime. Thus, to understand the divergences
of a computation in a background $\phi_{\rm cl}(x)$ that is smoothly varying, we can
divide spacetime into small boxes, in each of which $\phi_{\rm cl}(x)$ is approximately
constant, and expand in the derivatives $\partial_\mu\phi_{\rm cl}(x)$. In this expansion in powers
of $\partial_\mu\phi_{\rm cl}(x)$, the Taylor series coefficients are functional derivatives of $\Gamma$ in a
constant background, which we know can be renormalized. The conclusion
of this intuitive argument has been checked at the two-loop level for several
nontrivial background field configurations.

Our general result on the renormalization of theories with spontaneously 
broken symmetry has an important implication for the physical predictions 
of these theories. In a renormalizable field theory, the most basic quantities 
of the theory cannot be predicted, because they are the quantities that must 
be specified as part of the definition of the theory. For example, in QED, the 
mass and charge of the electron must be adjusted from outside in order to 
define the theory. The predictions of QED are quantities that do not appear 
in the basic Lagrangian, for example, the anomalous magnetic moment of 
the electron. In renormalizable theories with spontaneously broken symmetry, 
however, the symmetry-breaking produces a large number of distinct masses 
and couplings, which depend on the relatively small number of parameters of 
the original symmetric theory. After the original parameters of the theory are 
fixed, any additional observable of the theory can be predicted unambiguously. 
For example, in the linear sigma model studied in this chapter, we took the
values of the four-point coupling $\lambda$ and the vacuum expectation value $\langle\phi\rangle$ as
input parameters; we then calculated the mass of the $\sigma$ particle in terms of
these parameters in an unambiguous way.
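
At the classical level, this prediction is simple enough to sketch directly. The example below assumes the tree-level potential $V = -\frac12\mu^2\phi^2 + \frac14\lambda\phi^4$ along the $\sigma$ direction, with hypothetical parameter values and our own function names. It computes the $\sigma$ mass once from the curvature of the potential at its minimum and once from the inputs $\lambda$ and $\langle\phi\rangle$ alone.

```python
import math

def tree_level_vev(mu2, lam):
    """Minimum of V = -mu2/2 phi^2 + lam/4 phi^4 for positive phi."""
    return math.sqrt(mu2 / lam)

def tree_level_sigma_mass(mu2, lam):
    """Square root of the curvature V''(v) = -mu2 + 3 lam v^2 at the minimum."""
    v = tree_level_vev(mu2, lam)
    return math.sqrt(-mu2 + 3.0 * lam * v ** 2)

def sigma_mass_from_inputs(lam, vev):
    """Classical prediction written in terms of the input parameters only."""
    return math.sqrt(2.0 * lam) * vev

if __name__ == "__main__":
    for mu2, lam in ((1.0, 0.5), (2.5, 0.1), (0.3, 2.0)):  # hypothetical values
        v = tree_level_vev(mu2, lam)
        print(mu2, lam, tree_level_sigma_mass(mu2, lam), sigma_mass_from_inputs(lam, v))
```

The two columns agree for every choice of $\mu^2$ and $\lambda$. It is this independence of the particular Lagrangian parameters that loop corrections then modify by a finite amount.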

There is a general argument that implies that, once we fix the parameters
of the Lagrangian, we must find an unambiguous, finite formula for the
$\sigma$ mass in $\phi^4$ theory, or, more generally, for any additional parameter of a
renormalizable quantum field theory. In general, this parameter will be
determined at the classical level in terms of the couplings in the Lagrangian.
For the example of the $\sigma$ mass in the linear sigma model, this classical relation
is

$$
m - \sqrt{2\lambda}\,\langle\phi\rangle = 0, \tag{11.104}
$$

where $m$ is the mass of the $\sigma$ and $\lambda$ gives the four-$\phi$ scattering amplitude
at threshold. In general, loop corrections will modify this relation,
contributing some nonzero expression to the right-hand side of this equation.
However, since Eq. (11.104) is valid at the classical level however the
parameters of the Lagrangian are modified, it holds equally well when we add
counterterms to the Lagrangian and then adjust these counterterms order by
order. Thus, the counterterms must give zero contributions to the right-hand
side of Eq. (11.104). Therefore, the perturbative corrections to Eq. (11.104)
must be automatically ultraviolet-finite. A relation of this type, true at the
classical level for all values of the couplings in the Lagrangian, but corrected
by loop effects, is called a zeroth-order natural relation. The argument we have
given implies that, for any such relation, the loop corrections are finite and
constitute predictions of the quantum field theory. We will see another example
of such a relation in Problem 11.2.

Goldstone's Theorem Revisited

As a final application of the effective action formalism, let us return to the
question of whether Goldstone's theorem is valid in the presence of quantum
corrections. Recall that we proved this theorem at the classical level at the end
of Section 11.1: We showed in (11.13) that, if the Lagrangian has a continuous
symmetry that is spontaneously broken, the matrix of second derivatives of
the classical potential $V(\phi)$ has a corresponding zero eigenvalue. According
to Eq. (11.11), this implies that the classical theory contains a massless scalar
particle, associated with the spontaneously broken symmetry.

Using the effective action formalism, this argument can be repeated almost
verbatim in the full quantum field theory. The effective potential $V_{\rm eff}(\phi_{\rm cl})$
encapsulates the full solution to the theory, including all orders of quantum
corrections. At the same time, it satisfies the general properties of the
classical potential: It is invariant to the symmetries of the theory, and its
minimum gives the vacuum expectation value of $\phi$. This means that the
argument we gave in (11.13) works in exactly the same way for $V_{\rm eff}$ as it does
for $V$: If a continuous symmetry of the original Lagrangian is spontaneously
broken by $\langle\phi\rangle$, the matrix of second derivatives of $V_{\rm eff}(\phi_{\rm cl})$ has a zero
eigenvalue along the symmetry direction.

We now argue that, just as at the classical level, the presence of such a zero
eigenvalue implies the existence of a massless scalar particle. In our discussion
of the general properties of the effective action, we showed that its second
functional derivative is the inverse propagator, and that, through Eq. (11.97),
this derivative yields the spectrum of masses in the quantum theory. Let us
rewrite Eq. (11.97) for a theory that contains several scalar fields:

$$
\int d^4x\; e^{ip\cdot(x-y)}\,
  \frac{\delta^2\Gamma}{\delta\phi^i_{\rm cl}(x)\,\delta\phi^j_{\rm cl}(y)} = 0. \tag{11.105}
$$

A particle of mass $m$ corresponds to a zero eigenvalue of this matrix equation
at $p^2 = m^2$. Now set $p = 0$. This implies that we differentiate $\Gamma[\phi_{\rm cl}]$ with
respect to constant fields. Thus, we can replace $\Gamma[\phi_{\rm cl}]$ by its value with constant
classical fields, which is just the effective potential. We find that the quantum
field theory contains a scalar particle of zero mass when the matrix of second
derivatives,

$$
\frac{\partial^2 V_{\rm eff}}{\partial\phi^i_{\rm cl}\,\partial\phi^j_{\rm cl}},
$$

has a zero eigenvalue. This completes the proof of Goldstone's theorem.
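
The structure of this argument can be illustrated with any $O(N)$-invariant potential. As a stand-in for $V_{\rm eff}$ we assume the tree-level form $V = -\frac12\mu^2\phi\cdot\phi + \frac14\lambda(\phi\cdot\phi)^2$ with $N = 3$. The full effective potential has the same symmetry, but this specific form, the parameter values, and the function names are our own assumptions. The sketch checks that the matrix of second derivatives annihilates every direction generated by a rotation of the vacuum.

```python
import math

def hessian(phi, mu2, lam):
    """Second derivatives of V = -mu2/2 phi.phi + lam/4 (phi.phi)^2."""
    n = len(phi)
    r2 = sum(x * x for x in phi)
    return [[(lam * r2 - mu2) * (1.0 if i == j else 0.0)
             + 2.0 * lam * phi[i] * phi[j]
             for j in range(n)] for i in range(n)]

def rotate(phi, a, b):
    """Action of the rotation generator in the (a, b) plane on phi."""
    out = [0.0] * len(phi)
    out[a] = -phi[b]
    out[b] = phi[a]
    return out

def apply(matrix, vec):
    """Matrix-vector product for nested-list matrices."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

if __name__ == "__main__":
    mu2, lam = 1.0, 0.5  # hypothetical sample values
    v = math.sqrt(mu2 / lam)
    phi0 = [0.6 * v, 0.0, 0.8 * v]  # a point on the vacuum manifold |phi| = v
    H = hessian(phi0, mu2, lam)
    for a, b in ((0, 1), (0, 2), (1, 2)):
        print((a, b), apply(H, rotate(phi0, a, b)))
    print("radial:", apply(H, phi0))
```

Directions along the symmetry orbit are flat, while the radial direction has curvature $2\mu^2$. For $N = 3$ the rotated vectors span the two-dimensional tangent space of the vacuum sphere, corresponding to two massless modes.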

This argument for Goldstone's theorem illustrates the power of the effective
action formalism. The formalism gives a geometrical picture of spontaneous
symmetry breaking that is valid to any order in quantum corrections.
As a bonus, it is built up from objects that are renormalized in a simple way.
This formalism will prove useful in understanding the applications of
spontaneously broken symmetry that occur, in several different contexts,
throughout the rest of this book.





Problems 

11.1 Spin-wave theory.

(a) Prove the following wonderful formula: Let $\phi(x)$ be a free scalar field with
propagator $\langle T\phi(x)\phi(0)\rangle = D(x)$. Then

$$
\big\langle T e^{i\phi(x)} e^{-i\phi(0)}\big\rangle = e^{[D(x) - D(0)]}.
$$

(The factor $D(0)$ gives a formally divergent adjustment of the overall
normalization.)

(b) We can use this formula in Euclidean field theory to discuss correlation functions
in a theory with spontaneously broken symmetry for $T < T_C$. Let us consider
only the simplest case of a broken $O(2)$ or $U(1)$ symmetry. We can write the
local spin density as a complex variable

$$
s(x) = s^1(x) + i s^2(x).
$$

The global symmetry is the transformation

$$
s(x) \to e^{-i\alpha} s(x).
$$

If we assume that the physics freezes the modulus of $s(x)$, we can parametrize

$$
s(x) = A e^{i\phi(x)}
$$

and write an effective Lagrangian for the field $\phi(x)$. The symmetry of the theory
becomes the translation symmetry

$$
\phi(x) \to \phi(x) - \alpha.
$$

Show that (for $d > 0$) the most general renormalizable Lagrangian consistent
with this symmetry is the free field theory

$$
\mathcal{L} = \tfrac12 \rho\,(\nabla\phi)^2.
$$

In statistical mechanics, the constant $\rho$ is called the spin wave modulus. A
reasonable hypothesis for $\rho$ is that it is finite for $T < T_C$ and tends to 0 as
$T \to T_C$ from below.

(c) Compute the correlation function $\langle s(x) s^*(0)\rangle$. Adjust $A$ to give a physically
sensible normalization (assuming that the system has a physical cutoff at the
scale of one atomic spacing) and display the dependence of this correlation
function on $x$ for $d = 1, 2, 3, 4$. Explain the significance of your results.

11.2 A zeroth-order natural relation. This problem studies an $N = 2$ linear
sigma model coupled to fermions:

$$
\mathcal{L} = \tfrac12(\partial_\mu\phi^i)^2 + \tfrac12\mu^2(\phi^i)^2 - \tfrac{\lambda}{4}\big((\phi^i)^2\big)^2
 + \bar\psi(i\not\!\partial)\psi - g\,\bar\psi(\phi^1 + i\gamma^5\phi^2)\psi, \tag{1}
$$

where $\phi^i$ is a two-component field, $i = 1, 2$.

(a) Show that this theory has the following global symmetry:

$$
\begin{aligned}
\phi^1 &\to \cos\alpha\,\phi^1 - \sin\alpha\,\phi^2, \\
\phi^2 &\to \sin\alpha\,\phi^1 + \cos\alpha\,\phi^2, \\
\psi &\to e^{-i\alpha\gamma^5/2}\,\psi.
\end{aligned} \tag{2}
$$

Show also that the solution to the classical equations of motion with the
minimum energy breaks this symmetry spontaneously.

(b) Denote the vacuum expectation value of the field $\phi^i$ by $v$ and make the change
of variables

$$
\phi^i(x) = (v + \sigma(x), \pi(x)). \tag{3}
$$

Write out the Lagrangian in these new variables, and show that the fermion
acquires a mass given by

$$
m_f = g \cdot v. \tag{4}
$$

(c) Compute the one-loop radiative correction to $m_f$, choosing renormalization
conditions so that $v$ and $g$ (defined as the $\bar\psi\psi\pi$ vertex at zero momentum
transfer) receive no radiative corrections. Show that relation (4) receives nonzero
corrections but that these corrections are finite. This is in accord with our
general discussion in Section 11.6.

11.3 The Gross-Neveu model. The Gross-Neveu model is a model in two spacetime
dimensions of fermions with a discrete chiral symmetry:

$$
\mathcal{L} = \bar\psi_i\, i\not\!\partial\, \psi_i + \tfrac12 g^2 (\bar\psi_i\psi_i)^2,
$$

with $i = 1, \ldots, N$. The kinetic term of two-dimensional fermions is built from
matrices $\gamma^\mu$ that satisfy the two-dimensional Dirac algebra. These matrices can
be $2 \times 2$:

$$
\gamma^0 = \sigma^2, \qquad \gamma^1 = i\sigma^1,
$$

where $\sigma^i$ are Pauli sigma matrices. Define

$$
\gamma^5 = \gamma^0\gamma^1 = \sigma^3;
$$

this matrix anticommutes with the $\gamma^\mu$.

(a) Show that this theory is invariant with respect to

$$
\psi_i \to \gamma^5\psi_i,
$$

and that this symmetry forbids the appearance of a fermion mass.

(b) Show that this theory is renormalizable in 2 dimensions (at the level of
dimensional analysis).

(c) Show that the functional integral for this theory can be represented in the
following form:

$$
\int\!\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}\sigma\;
 \exp\Big[i\int d^2x\,\Big\{\bar\psi_i\, i\not\!\partial\, \psi_i - \sigma\bar\psi_i\psi_i - \frac{1}{2g^2}\sigma^2\Big\}\Big],
$$

where $\sigma(x)$ (not to be confused with a Pauli matrix) is a new scalar field with
no kinetic energy terms.

(d) Compute the leading correction to the effective potential for $\sigma$ by integrating
over the fermion fields $\psi_i$. You will encounter the determinant of a Dirac
operator; to evaluate this determinant, diagonalize the operator by first going
to Fourier components and then diagonalizing the $2 \times 2$ Pauli matrix associated
with each Fourier mode. (Alternatively, you might just take the determinant of
this $2 \times 2$ matrix.) This 1-loop contribution requires a renormalization
proportional to $\sigma^2$ (that is, a renormalization of $g^2$). Renormalize by minimal
subtraction.

(e) Ignoring two-loop and higher-order contributions, minimize this potential. Show
that the $\sigma$ field acquires a vacuum expectation value which breaks the symmetry
of part (a). Convince yourself that this result does not depend on the particular
renormalization condition chosen.

(f) Note that the effective potential derived in part (e) depends on $g$ and $N$
according to the form

$$
V_{\rm eff}(\sigma_{\rm cl}) = N \cdot f(g^2 N).
$$

(The overall factor of $N$ is expected in a theory with $N$ fields.) Construct a
few of the higher-order contributions to the effective potential and show that
they contain additional factors of $N^{-1}$ which suppress them if we take the limit
$N \to \infty$, $(g^2 N)$ fixed. In this limit, the result of part (e) is unambiguous.




Chapter 12 


The Renormalization Group 


In the past two chapters, our main goal has been to determine when, and 
how, the cancellation of ultraviolet divergences in quantum field theory takes 
place. We have seen that, in a large class of field theories, the divergences 
appear only in the values of a few parameters: the bare masses and coupling 
constants, or, in renormalized perturbation theory, the counterterms. Aside 
from the shift in these parameters, virtual particles with very large momenta 
have no effect on computations in these theories. 

The cancellation of ultraviolet divergences is essential if a theory is to 
yield quantitative physical predictions. But, at a deep level, the fact that 
high-momentum virtual quanta can have so little effect on a theory is quite 
surprising. One of the essential features of quantum field theory is locality, that 
is, the fact that fields at different spacetime points are independent degrees of 
freedom with independent quantum fluctuations. The quantum fluctuations 
at arbitrarily short distances appear in Feynman diagram computations as 
virtual quanta with arbitrarily high momenta. In a renormalizable theory, the 
loop integrals over virtual-particle momenta are always dominated by values 
comparable to the finite external particle momenta. But why? It is not easy 
to understand how the quantum fluctuations associated with extremely short 
distances can be so innocuous as to affect a theory only through the values of 
a few of its parameters. 

This chapter begins with a physical picture, due to Kenneth Wilson, that 
explains this unusual and counterintuitive simplification. This picture 
generalizes the idea of the distance- or scale-dependent electric charge, 
introduced at the end of Chapter 7, and suggests that all of the parameters of a 
renormalizable field theory can usefully be thought of as scale-dependent 
entities. We will see that this scale dependence is described by simple 
differential equations, called renormalization group equations. The solutions of 
these equations will lead to physical predictions of a completely new type: 
predictions that, under certain circumstances, the correlation functions of a 
quantum field exhibit unusual but computable scaling laws as a function of their 
coordinates. 

12.1 Wilson’s Approach to Renormalization Theory 

Wilson’s method is based on the functional integral approach to field theory, in 
which the degrees of freedom of a quantum field are variables of integration. In 
this approach, one can study the origin of ultraviolet divergences by isolating 
the dependence of the functional integral on the short-distance degrees of 
freedom of the field.* In this section, we will illustrate this idea in the simplest 
example of $\phi^4$ theory. 

To make our analysis more concrete, we will drop the elegant but somewhat 
mysterious method of dimensional regularization in this section and instead use 
a sharp momentum cutoff. Since we will be working here only in $\phi^4$ theory, 
we will not be concerned that this cutoff makes it difficult to satisfy Ward 
identities. Wilson’s analysis can be adapted to QED and other situations where 
this subtlety is important, but the case of $\phi^4$ theory is sufficient to give 
us the basic qualitative results of this approach. 

In Section 9.2, we constructed the Green’s functions of $\phi^4$ theory in terms 
of a functional integral representation of the generating functional $Z[J]$. The 
basic integration variables are the Fourier components of the field $\phi(k)$, so 
$Z[J]$ is given concretely by the expression 

$$Z[J] = \int \mathcal{D}\phi\; e^{i\int [\mathcal{L} + J\phi]} = \Big( \prod_k \int d\phi(k) \Big)\, e^{i\int [\mathcal{L} + J\phi]}. \tag{12.1}$$

To impose a sharp ultraviolet cutoff $\Lambda$, we restrict the number of the 
integration variables displayed in (12.1). That is, we integrate only over 
$\phi(k)$ with $|k| < \Lambda$, and set $\phi(k) = 0$ for $|k| > \Lambda$. 

This modification of the functional integral suggests a method for assessing 
the influence of the quantum fluctuations at very short distances or very 
large momenta. In the functional integral representation, these fluctuations 
are represented by the integrals over the Fourier components of $\phi$ with 
momenta near the cutoff. Why not explicitly perform the integrals over these 
variables? Then we can compare the result to the original functional integral, 
and determine precisely the influence of these high-momentum modes on the 
physical predictions of the theory. 

Before beginning this analysis, though, we must introduce one modification. 
At first sight, it seems most natural to define the ultraviolet cutoff in 
Minkowski space. However, a cutoff $k^2 < \Lambda^2$ is not completely effective 
in controlling large momenta, since in lightlike directions the components of 
$k$ can be very large while $k^2$ remains small. We will therefore consider the 
cutoff to be imposed on the Euclidean momenta obtained after Wick rotation. 
Equivalently, we consider the Euclidean form of the functional integral, 
presented in Section 9.3, and restrict its variables $\phi(k)$, with $k$ 
Euclidean, to $|k| < \Lambda$. 


*Wilson’s ideas are reviewed in K. G. Wilson and J. Kogut, Phys. Repts. 12C, 
75 (1974). 

The transition to Euclidean space also brings us closer to the connection 
between renormalization theory and statistical mechanics advertised in 
Chapter 8. As we saw in Section 9.3, the Euclidean functional integral for 
$\phi^4$ theory has precisely the same form as the continuum description of the 
statistical mechanics of a magnet. The field $\phi(x)$ is interpreted as the 
fluctuating spin field $s(x)$. A real magnet is built of atoms, and the atomic 
spacing provides a physical cutoff, a shortest distance over which fluctuations 
can take place. The cut-off functional integral models the effects of this 
atomic size in a crude way. 

By pursuing this analogy, we can derive some physical intuition about the 
effects of the ultraviolet cutoff in a field theory. In a magnet, it is quite easy 
to visualize statistical fluctuations of the spins at the atomic scale. In fact, for 
values of the temperature away from any critical points, the statistical 
fluctuations are restricted to this scale; over distances of tens of atomic 
spacings, the magnet already shows its homogeneous macroscopic behavior. We have 
seen in Chapter 8 that we can approximate the correlation function of the 
spin field by the propagator of a Euclidean $\phi^4$ theory. In this approximation, 

$$\langle s(x)\, s(0) \rangle = \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik\cdot x}}{k^2 + m^2} \;\underset{|x|\to\infty}{\sim}\; \frac{1}{4\pi^2 |x|^2}\, e^{-m|x|}. \tag{12.2}$$

As long as the temperature is far from the critical temperature, the size of the 
“mass” $m$ is determined by the one natural scale in the problem, the atomic 
spacing. Thus, we expect $m \sim \Lambda$. In our field theory calculations, we 
were specifically interested in the situation where $m \ll \Lambda$, and we 
adjusted the parameters of the theory to satisfy this condition. In describing a 
magnet, it appears that no such adjustment is called for. 

However, we saw in Chapter 8 that there is one circumstance in which 
the correlations of the spin field are much longer than the atomic spacing, so 
that, indeed, $m \ll \Lambda$. When the spin system begins to magnetize, just in 
the vicinity of the critical point, the spins become correlated over arbitrarily 
long distances as the fluctuating spins attempt to choose their eventual 
direction of magnetization. To study these long-range correlations in a magnet, 
one must carefully adjust the temperature to bring the system into the vicinity 
of the phase transition. In the same way, we can imagine making a fine adjustment 
of the parameter $m$ of $\phi^4$ theory to bring the quantum field theory into a 
region of parameters where we do find correlations of the field $\phi(x)$ over 
distances much larger than $1/\Lambda$. 


Integrating Over a Single Momentum Shell 

With this introduction, we will now carry out the integration over the 
high-momentum degrees of freedom of $\phi$. We begin by writing the functional 
integral (12.1) more explicitly for the case of $\phi^4$ theory. We apply the 
cutoff prescription described earlier, and set $J = 0$ for simplicity. Then 

$$Z = \int [\mathcal{D}\phi]_\Lambda \exp\Big( -\int d^dx\, \Big[ \tfrac{1}{2}(\partial_\mu\phi)^2 + \tfrac{1}{2} m^2 \phi^2 + \tfrac{\lambda}{4!}\phi^4 \Big] \Big), \tag{12.3}$$

where 

$$[\mathcal{D}\phi]_\Lambda = \prod_{|k|<\Lambda} d\phi(k). \tag{12.4}$$

In the Lagrangian of Eq. (12.3), $m$ and $\lambda$ are the bare parameters, and so 
there are no counterterms. As in our study of the superficial degree of 
divergence, it will be useful to carry out this analysis in an arbitrary 
spacetime dimension $d$. 

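We now divide the integration variables $\phi(k)$ into two groups. Choose a 
fraction $b < 1$. The variables $\phi(k)$ with $b\Lambda \le |k| < \Lambda$ are the 
high-momentum degrees of freedom that we will integrate over. To label these 
degrees of freedom, let us define 

$$\hat\phi(k) = \begin{cases} \phi(k) & \text{for } b\Lambda \le |k| < \Lambda; \\ 0 & \text{otherwise.} \end{cases}$$

Next, let us define a new $\phi(k)$, which is identical to the old for 
$|k| < b\Lambda$ and zero for $|k| > b\Lambda$. Then we can replace the old $\phi$ 
in the Lagrangian with $\phi + \hat\phi$, and rewrite Eq. (12.3) as 

$$\begin{aligned} Z &= \int \mathcal{D}\phi \int \mathcal{D}\hat\phi\, \exp\Big( -\int d^dx\, \Big[ \tfrac{1}{2}(\partial_\mu\phi + \partial_\mu\hat\phi)^2 + \tfrac{1}{2} m^2 (\phi + \hat\phi)^2 + \tfrac{\lambda}{4!}(\phi + \hat\phi)^4 \Big] \Big) \\ &= \int \mathcal{D}\phi\; e^{-\int \mathcal{L}(\phi)} \int \mathcal{D}\hat\phi\, \exp\Big( -\int d^dx\, \Big[ \tfrac{1}{2}(\partial_\mu\hat\phi)^2 + \tfrac{1}{2} m^2 \hat\phi^2 \\ &\qquad\qquad + \lambda \Big( \tfrac{1}{6}\phi^3\hat\phi + \tfrac{1}{4}\phi^2\hat\phi^2 + \tfrac{1}{6}\phi\hat\phi^3 + \tfrac{1}{4!}\hat\phi^4 \Big) \Big] \Big). \end{aligned} \tag{12.5}$$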

In the final expression we have gathered all terms independent of $\hat\phi$ into 
$\mathcal{L}(\phi)$. Note that quadratic terms of the form $\phi\hat\phi$ vanish, 
since Fourier components of different wavelengths are orthogonal. 
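
The numerical factors multiplying the mixed terms in the quartic interaction are 
just binomial coefficients divided by $4!$. As a quick check, the following 
minimal sketch (the function name `quartic_cross_terms` is our own, not part of 
the text) lists the coefficient of $\phi^{4-j}\hat\phi^{j}$ in 
$(\phi+\hat\phi)^4/4!$:

```python
from fractions import Fraction
from math import comb, factorial


def quartic_cross_terms():
    """Coefficient of phi^(4-j) * phihat^j in (phi + phihat)^4 / 4!, j = 0..4.

    Hypothetical helper for illustration only.
    """
    return [Fraction(comb(4, j), factorial(4)) for j in range(5)]


coeffs = quartic_cross_terms()
for j, c in enumerate(coeffs):
    print(f"phi^{4 - j} * phihat^{j}: {c}")
```

The $j = 0$ entry belongs to $\mathcal{L}(\phi)$; the entries $j = 1, \ldots, 4$ 
are the four terms kept inside the $\hat\phi$ integral.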

The next few paragraphs will explain how to perform the integral over $\hat\phi$. 
This integration will transform (12.5) into an expression of the form 

$$Z = \int [\mathcal{D}\phi]_{b\Lambda} \exp\Big( -\int d^dx\, \mathcal{L}_{\rm eff} \Big), \tag{12.6}$$

where $\mathcal{L}_{\rm eff}(\phi)$ involves only the Fourier components $\phi(k)$ 
with $|k| < b\Lambda$. We will see that $\mathcal{L}_{\rm eff}(\phi) = 
\mathcal{L}(\phi)$ plus corrections proportional to powers of $\lambda$. These 
correction terms compensate for the removal of the large-$k$ Fourier components 
$\hat\phi$, by supplying the interactions among the remaining $\phi(k)$ that 
were previously mediated by fluctuations of the $\hat\phi$. 

To carry out the integrals over the $\hat\phi(k)$, we use the same method that we 
applied in Section 9.2 to derive Feynman rules. In fact, we will see below that 
the new terms in $\mathcal{L}_{\rm eff}$ can be written in a diagrammatic form. In 
this analysis, we treat the quartic terms in (12.5), all proportional to 
$\lambda$, as perturbations. Since we are mainly interested in the situation 
$m^2 \ll \Lambda^2$, we will also treat the mass term $\tfrac{1}{2} m^2 \hat\phi^2$ 
as a perturbation. Then the leading-order term in the portion of the Lagrangian 
involving $\hat\phi$ is 

$$\int \mathcal{L}_0 = \frac{1}{2} \int_{b\Lambda \le |k| < \Lambda} \frac{d^dk}{(2\pi)^d}\, \hat\phi^*(k)\, k^2\, \hat\phi(k). \tag{12.7}$$

This term leads to a propagator 

$$\overline{\hat\phi(k)\, \hat\phi(p)} = \frac{\int \mathcal{D}\hat\phi\; e^{-\int \mathcal{L}_0}\, \hat\phi(k)\, \hat\phi(p)}{\int \mathcal{D}\hat\phi\; e^{-\int \mathcal{L}_0}} = \frac{1}{k^2}\, (2\pi)^d\, \delta^{(d)}(k + p)\, \Theta(k), \tag{12.8}$$

where 

$$\Theta(k) = \begin{cases} 1 & \text{if } b\Lambda \le |k| < \Lambda; \\ 0 & \text{otherwise.} \end{cases} \tag{12.9}$$

We will regard the remaining $\hat\phi$ terms in Eq. (12.5) as perturbations, and 
expand the exponential. The various contributions from these perturbations 
can be evaluated by using Wick’s theorem with (12.8) as the propagator. 

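First consider the term that results from expanding to one power of the 
$\phi^2\hat\phi^2$ term in the exponent of (12.5). We find 

$$-\int d^dx\, \frac{\lambda}{4}\, \phi^2\, \overline{\hat\phi\, \hat\phi} = -\frac{1}{2} \int \frac{d^dk_1}{(2\pi)^d}\, \mu\, \phi(k_1)\, \phi(-k_1), \tag{12.10}$$

where the coefficient $\mu$ is the result of contracting the two $\hat\phi$ fields: 

$$\mu = \frac{\lambda}{2} \int_{b\Lambda \le |k| < \Lambda} \frac{d^dk}{(2\pi)^d}\, \frac{1}{k^2} = \frac{\lambda}{(4\pi)^{d/2}\, \Gamma(\frac{d}{2})}\, \frac{1 - b^{d-2}}{d - 2}\, \Lambda^{d-2}. \tag{12.11}$$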


The term (12.10) could just as well have arisen from an expansion of the 
exponential 

$$\exp\Big( -\int d^dx\, \tfrac{1}{2}\, \mu\, \phi^2 + \cdots \Big). \tag{12.12}$$

We will soon see that the rest of the perturbation series also organizes itself 
into this form. The coefficient $\mu$ therefore gives a positive correction to the 
$m^2$ term in $\mathcal{L}$. 
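
The closed form in (12.11) can be checked against a direct radial integration 
over the momentum shell. The sketch below is illustrative only: the function 
names are our own, the cutoff is set by hand, and the angular integral is taken 
to be the standard $d$-dimensional solid angle, which gives the factor 
$2/[(4\pi)^{d/2}\Gamma(d/2)]$ after dividing by $(2\pi)^d$.

```python
import math


def angular_factor(d):
    """Solid angle in d dimensions divided by (2 pi)^d."""
    return 2.0 / ((4.0 * math.pi) ** (d / 2.0) * math.gamma(d / 2.0))


def mu_closed_form(lam, b, cutoff, d):
    """Right-hand side of Eq. (12.11); valid for d != 2."""
    return (lam / ((4.0 * math.pi) ** (d / 2.0) * math.gamma(d / 2.0))
            * (1.0 - b ** (d - 2.0)) / (d - 2.0) * cutoff ** (d - 2.0))


def mu_numeric(lam, b, cutoff, d, steps=20000):
    """Midpoint-rule value of (lam/2) * int d^d k/(2 pi)^d (1/k^2) over the shell."""
    lo, hi = b * cutoff, cutoff
    h = (hi - lo) / steps
    # Radial integrand: k^(d-1) from the measure times 1/k^2 from the propagator.
    radial = sum((lo + (i + 0.5) * h) ** (d - 3.0) for i in range(steps)) * h
    return 0.5 * lam * angular_factor(d) * radial


# Example values (arbitrary): lambda = 0.1, b = 0.5, cutoff = 1.
for dim in (3.0, 3.5, 4.0):
    print(dim, mu_closed_form(0.1, 0.5, 1.0, dim), mu_numeric(0.1, 0.5, 1.0, dim))
```

For positive $\lambda$ and $b < 1$ both evaluations are positive, in line with 
the statement that $\mu$ raises the $m^2$ term.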

The higher orders of the perturbation theory in the correction terms can 
be worked out in a similar way. As in our derivation of the standard 
perturbation theory for $\phi^4$ theory, it is useful to adopt a diagrammatic 
notation. Represent the propagator (12.8) by a double line. This propagator will 
connect pairs of fields $\hat\phi$ from the various quartic interactions. 
Represent the fields $\phi$ in these interactions, which are not integrated over, 
as single external lines. Then, for example, the contribution of (12.10) 
corresponds to the following diagram: 

[diagram: one vertex with two single external lines and a closed double-line loop] 

At order $\lambda^2$, we will have, among other contributions, terms involving the 
contractions of two interaction terms $\lambda\phi^2\hat\phi^2$. Each term 
corresponds to a vertex connecting two single lines and two double lines. There 
are two possible contractions: 

[diagrams: the two contractions of a pair of $\phi^2\hat\phi^2$ vertices, one disconnected and one connected]    (12.13) 

Of these, the first, which is a disconnected diagram, supplies the 
order-$\lambda^2$ term in the exponential (12.12). The second is a new 
contribution, which will become a correction to the $\phi^4$ interaction in 
$\mathcal{L}(\phi)$. 

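Let us now evaluate this second contribution. For simplicity, we consider 
the limit in which the external momenta carried by the factors $\phi$ are very 
small compared to $b\Lambda$, so we can ignore them. Then this diagram has the 
value 

$$-\frac{1}{4!} \int d^dx\, \zeta\, \phi^4, \tag{12.14}$$

where 

$$\begin{aligned} \zeta &= -4!\, \frac{2}{2!} \Big( \frac{\lambda}{4} \Big)^2 \int_{b\Lambda \le |k| < \Lambda} \frac{d^dk}{(2\pi)^d} \Big( \frac{1}{k^2} \Big)^2 = -\frac{3\lambda^2}{(4\pi)^{d/2}\, \Gamma(\frac{d}{2})}\, \frac{(1 - b^{d-4})}{d - 4}\, \Lambda^{d-4} \\ &\underset{d \to 4}{\longrightarrow}\; -\frac{3\lambda^2}{16\pi^2}\, \log\frac{1}{b}. \end{aligned} \tag{12.15}$$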


The 2 in the numerator counts the two possible contractions; there are no 
additional combinatoric factors from counting external legs or vertices. In 
the analysis of $\phi^4$ theory in Section 10.2, we encountered a similar diagram, 
integrated over a range of momenta from 0 to $\Lambda$, producing a logarithmic 
ultraviolet divergence. In Wilson’s treatment this divergence is not a 
pathology but simply a sign that the diagram is receiving contributions from all 
momentum scales. Indeed, it receives an equal contribution from each 
logarithmic interval between the momentum scales $m$ and $\Lambda$. We will see 
below that the (finite) contribution to this diagram from each momentum interval 
has a natural physical importance. 
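
The statement that each logarithmic interval contributes equally can be seen 
directly from the radial integral in $d = 4$, where the measure $k^3\,dk$ times 
$(1/k^2)^2$ gives $dk/k$. The following sketch (function names are ours, chosen 
for illustration) evaluates (12.15) for a single shell and integrates $dk/k$ 
numerically over successive octaves of momentum:

```python
import math


def zeta_shell(lam, b, cutoff, d):
    """Shell contribution zeta of Eq. (12.15), external momenta neglected."""
    if abs(d - 4.0) < 1e-12:
        # d -> 4 limit quoted in Eq. (12.15); independent of the cutoff.
        return -3.0 * lam ** 2 / (16.0 * math.pi ** 2) * math.log(1.0 / b)
    prefactor = 3.0 * lam ** 2 / ((4.0 * math.pi) ** (d / 2.0) * math.gamma(d / 2.0))
    return -prefactor * (1.0 - b ** (d - 4.0)) / (d - 4.0) * cutoff ** (d - 4.0)


def log_interval_weight(lo, hi, steps=20000):
    """Midpoint-rule value of int_lo^hi dk k^3 / k^4, the d = 4 radial integral."""
    h = (hi - lo) / steps
    return sum(1.0 / (lo + (i + 0.5) * h) for i in range(steps)) * h


# Three successive octaves below a cutoff of 1 (arbitrary units).
octaves = [log_interval_weight(2.0 ** -(n + 1), 2.0 ** -n) for n in range(3)]
print(octaves)
print(zeta_shell(0.1, 0.5, 1.0, 4.0), zeta_shell(0.1, 0.5, 1.0, 4.0 + 1e-6))
```

Each octave returns the same weight, $\log 2$, and the general-$d$ expression 
approaches the logarithmic form smoothly as $d \to 4$.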

The diagrammatic perturbation theory we have described generates contributions 
proportional not only to $\phi^2$ and $\phi^4$ but also to higher powers of 
$\phi$. For example, the following diagram generates a contribution to a 
$\phi^6$ interaction: 

[diagram: two vertices, each with three single external lines, joined by one double line] 

$$\propto\; \frac{1}{(p_1 + p_2 + p_3)^2}\, \Theta(p_1 + p_2 + p_3). \tag{12.16}$$

There are also derivative interactions, which arise when we no longer neglect 
the external momenta of the diagrams. A more exact treatment would 
Taylor-expand in these momenta; for instance, in addition to expression (12.14), 
we would obtain terms with two powers of external momenta, which we could 
rewrite as 

$$\int d^dx\; \eta\, \phi^2 (\partial_\mu\phi)^2. \tag{12.17}$$

We would also find terms with four, six, and more powers of the momenta 
carried by the $\phi$. In general, the procedure of integrating out the 
$\hat\phi$ generates all possible interactions of the fields $\phi$ and their 
derivatives. 

The diagrammatic corrections can be simplified slightly by resumming 
them as an exponential. We have seen already in (12.13) that our diagrammatic 
expansion generates disconnected diagrams. By the same combinatoric 
argument that we used in Eq. (4.52), we can rewrite the sum of the series as 
the exponential of the sum of the connected diagrams. This leads precisely to 
expression (12.6), with 

$$\mathcal{L}_{\rm eff} = \tfrac{1}{2}(\partial_\mu\phi)^2 + \tfrac{1}{2} m^2 \phi^2 + \tfrac{1}{4!}\lambda\phi^4 + (\text{sum of connected diagrams}). \tag{12.18}$$

The diagrammatic contributions include corrections to $m^2$ and $\lambda$, as well 
as all possible higher-dimension operators. We can now use the new Lagrangian 
$\mathcal{L}_{\rm eff}(\phi)$ to compute correlation functions of the $\phi(k)$, or 
to compute S-matrix elements. Since the $\phi(k)$ include only momenta up to 
$b\Lambda$, the loop diagrams in such a calculation would be integrated only up 
to that lowered cutoff. The correction terms in (12.18) precisely compensate for 
this change. 

One might well be puzzled by the appearance of higher-dimension operators 
in Eq. (12.18). We chose the original Lagrangian of $\phi^4$ theory to contain 
only renormalizable interactions. At first sight, it is disturbing that all 
possible nonrenormalizable interactions appear when we integrate out the 
variables $\hat\phi$. However, we will see below that our procedure actually 
keeps the contributions of these nonrenormalizable interactions under control. 
In fact, our analysis will imply that the presence of nonrenormalizable 
interactions in the original Lagrangian, defined to be used with very large 
cutoff $\Lambda$, has negligible effect on physics at scales much less than 
$\Lambda$. 

Renormalization Group Flows 

Let us now make a more careful comparison of the new functional integral 
(12.6) and the one we started with (12.3). The most convenient way to do 
this is to rescale distances and momenta in (12.6) according to 

$$k' = k/b, \qquad x' = xb, \tag{12.19}$$

so that the variable $k'$ is integrated over $|k'| < \Lambda$. Let us express the 
explicit form of (12.18) schematically as 

$$\int d^dx\, \mathcal{L}_{\rm eff} = \int d^dx\, \Big[ \tfrac{1}{2}(1 + \Delta Z)(\partial_\mu\phi)^2 + \tfrac{1}{2}(m^2 + \Delta m^2)\phi^2 + \tfrac{1}{4!}(\lambda + \Delta\lambda)\phi^4 + \Delta C\, (\partial_\mu\phi)^4 + \Delta D\, \phi^6 + \cdots \Big]. \tag{12.20}$$

In terms of the rescaled variable $x'$, this becomes 

$$\int d^dx\, \mathcal{L}_{\rm eff} = \int d^dx'\, b^{-d}\, \Big[ \tfrac{1}{2}(1 + \Delta Z)\, b^2 (\partial'_\mu\phi)^2 + \tfrac{1}{2}(m^2 + \Delta m^2)\phi^2 + \tfrac{1}{4!}(\lambda + \Delta\lambda)\phi^4 + \Delta C\, b^4 (\partial'_\mu\phi)^4 + \Delta D\, \phi^6 + \cdots \Big]. \tag{12.21}$$


Throughout this analysis, we have treated all terms beyond the first as small 
perturbations. As long as the original couplings are small, this is still a valid 
approximation in treating (12.21). 

The original functional integral led to the propagator (12.8). The new 
action (12.21) will give rise to exactly the same propagator, if we rescale the 
field $\phi$ according to 

$$\phi' = \big[ b^{2-d} (1 + \Delta Z) \big]^{1/2} \phi. \tag{12.22}$$

After this rescaling, the unperturbed action returns to its initial form, while 
the various perturbations undergo a transformation: 

$$\int d^dx\, \mathcal{L}_{\rm eff} = \int d^dx'\, \Big[ \tfrac{1}{2}(\partial'_\mu\phi')^2 + \tfrac{1}{2} m'^2 \phi'^2 + \tfrac{1}{4!}\lambda' \phi'^4 + C' (\partial'_\mu\phi')^4 + D' \phi'^6 + \cdots \Big]. \tag{12.23}$$

The new parameters of the Lagrangian are 

$$\begin{aligned} m'^2 &= (m^2 + \Delta m^2)(1 + \Delta Z)^{-1}\, b^{-2}, \\ \lambda' &= (\lambda + \Delta\lambda)(1 + \Delta Z)^{-2}\, b^{d-4}, \\ C' &= (C + \Delta C)(1 + \Delta Z)^{-2}\, b^{d}, \\ D' &= (D + \Delta D)(1 + \Delta Z)^{-3}\, b^{2d-6}, \end{aligned} \tag{12.24}$$

and so on. (The original Lagrangian had $C = D = 0$, but the same equations 
would apply if the initial values of $C$ and $D$ were nonzero.) All of the 
corrections, $\Delta m^2$, $\Delta\lambda$, and so on, arise from diagrams and 
thus are small compared to the leading terms if perturbation theory is justified. 
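
As a check on the powers of $b$ in (12.24), it may help to follow one term 
through explicitly. The sketch below treats the quartic coupling, using only 
the rescaled action (12.21) and the field redefinition (12.22), and taking the 
quartic term to carry the same $1/4!$ normalization as in (12.3):

```latex
\begin{align*}
\phi &= \big[ b^{2-d} (1 + \Delta Z) \big]^{-1/2} \phi' , \\
b^{-d}\, \tfrac{1}{4!} (\lambda + \Delta\lambda)\, \phi^4
  &= b^{-d}\, \tfrac{1}{4!} (\lambda + \Delta\lambda)\,
     \big[ b^{2-d} (1 + \Delta Z) \big]^{-2} \phi'^4 \\
  &= \tfrac{1}{4!}\, (\lambda + \Delta\lambda) (1 + \Delta Z)^{-2}\, b^{d-4}\, \phi'^4 .
\end{align*}
```

The factor $b^{-d}$ comes from the measure $d^dx = b^{-d}\, d^dx'$, and each power 
of $\phi$ supplies $b^{(d-2)/2}$; the other entries of (12.24) follow in the same 
way, with each derivative contributing an extra factor of $b$.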

By combining the operation of integrating out high-momentum degrees 
of freedom with the rescaling (12.19), we have rewritten this operation as a 
transformation of the Lagrangian. Continuing this procedure, we could 
integrate over another shell of momentum space and transform the Lagrangian 
further. Successive integrations produce further iterations of the 
transformation (12.24). If we take the parameter $b$ to be close to 1, so that 
the shells of momentum space are infinitesimally thin, the transformation 
becomes a continuous one. We can then describe the result of integrating over 
the high-momentum degrees of freedom of a field theory as a trajectory or a flow 
in the space of all possible Lagrangians. 

For historical reasons, these continuously generated transformations of 
Lagrangians are referred to as the renormalization group. They do not form a 
group in the formal sense, because the operation of integrating out degrees of 
freedom is not invertible. On the other hand, they are most certainly connected 
to renormalization, as we will now see. 

Imagine that we wish to compute a correlation function of fields whose 
momenta $p_i$ are all much less than $\Lambda$. We could compute this correlation 
function perturbatively using either the original Lagrangian $\mathcal{L}$, or 
the effective Lagrangian $\mathcal{L}_{\rm eff}$ obtained after integrating over 
all momentum shells down to the scale of the external momenta $p_i$. Both 
procedures must ultimately yield the same result. But in the first case, the 
effects of high-momentum fluctuations of the field do not show up until we 
compute loop diagrams. In the second case, these effects have already been 
absorbed into the new coupling constants ($m'$, $\lambda'$, etc.), so their 
influence can be seen directly from the Lagrangian. In the first procedure, the 
large shifts from the original (bare) parameters to the values appropriate to 
low-momentum processes appear suddenly in one-loop diagrams, and seem to 
invalidate the use of perturbation theory. In the second approach, these 
corrections are introduced slowly and systematically. A perturbative treatment 
is valid at every step as long as the effective coupling constants such as 
$\lambda'$ remain small. 

However, the parameters of the effective Lagrangian may be very different 
from those of the original Lagrangian, since we must iterate the 
transformation (12.24) many times to get from the large momentum $\Lambda$ down 
to the momentum scale of typical experiments. Let us therefore look more closely 
at how the Lagrangian tends to vary under the renormalization group 
transformations. 

The simplest case to consider is a Lagrangian in the vicinity of the point 
$m^2 = \lambda = C = D = \cdots = 0$, where all the perturbations vanish. We have 
defined our transformation so that this point is left unchanged; we say that 
the free-field Lagrangian 

$$\mathcal{L}_0 = \tfrac{1}{2}(\partial_\mu\phi)^2 \tag{12.25}$$

is a fixed point of the renormalization group transformation. 

In the vicinity of $\mathcal{L}_0$, we can ignore the terms $\Delta m^2$, 
$\Delta\lambda$, etc., in the iteration equations (12.24) and keep only those 
terms that are linear in the perturbations. This gives an especially simple 
transformation law: 

$$m'^2 = m^2\, b^{-2}, \quad \lambda' = \lambda\, b^{d-4}, \quad C' = C\, b^{d}, \quad D' = D\, b^{2d-6}, \quad \text{etc.} \tag{12.26}$$

Since $b < 1$, those parameters that are multiplied by negative powers of $b$ 
grow, while those that are multiplied by positive powers of $b$ decay. If the 
Lagrangian contains growing coefficients, these will eventually carry it away 
from $\mathcal{L}_0$. 

It is conventional to speak of the various terms in the effective Lagrangian 
as a set of local operators that can be added as perturbations to 
$\mathcal{L}_0$. We call the operators whose coefficients grow during the 
recursion procedure relevant operators. The coefficients that die away are 
associated with irrelevant operators. For example, the scalar field mass operator 
$\phi^2$ is always relevant, while the $\phi^4$ operator is relevant if $d < 4$. 
If the coefficient of some operator is multiplied by $b^0$ (for example, the 
operator $\phi^4$ in $d = 4$), we call this operator marginal; to find out 
whether its coefficient grows or decays, we must include the effect of 
higher-order corrections. 

In general, an operator with $N$ powers of $\phi$ and $M$ derivatives has a 
coefficient that transforms as 

$$C'_{N,M} = b^{N(d/2 - 1) + M - d}\, C_{N,M}. \tag{12.27}$$

Notice that the exponent is just $(d_{N,M} - d)$, where $d_{N,M}$ is the mass 
dimension of the operator as computed at the end of Section 10.1. In other 
words, relevant and marginal operators about the free theory $\mathcal{L}_0$ 
correspond precisely to superrenormalizable and renormalizable interaction terms 
in the power-counting analysis of Section 10.1. 
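
The classification implied by (12.27) is easy to tabulate. The sketch below is 
illustrative only (the helper names `scaling_exponent` and `classify` are ours); 
it computes the power of $b$ for an operator with $N$ fields and $M$ derivatives 
and labels the operator according to the sign of that power, recalling that 
$b < 1$:

```python
from fractions import Fraction


def scaling_exponent(n_fields, n_derivs, d):
    """Power of b in Eq. (12.27): N(d/2 - 1) + M - d."""
    d = Fraction(d)
    return n_fields * (d / 2 - 1) + n_derivs - d


def classify(n_fields, n_derivs, d):
    """Label an operator near the free-field fixed point (valid only there)."""
    exponent = scaling_exponent(n_fields, n_derivs, d)
    if exponent < 0:
        return "relevant"      # coefficient grows, since b < 1
    if exponent == 0:
        return "marginal"      # linearized analysis is inconclusive
    return "irrelevant"        # coefficient decays


# (N, M) pairs for the operators appearing in Eq. (12.26).
for name, (n, m) in {"phi^2": (2, 0), "phi^4": (4, 0),
                     "(d phi)^4": (4, 4), "phi^6": (6, 0)}.items():
    print(name, [classify(n, m, dim) for dim in (3, 4, 5)])
```

For $d = 4$ this reproduces the exponents $-2$, $0$, $d$, and $2d - 6$ of 
(12.26) for the mass term, $\lambda$, $C$, and $D$ respectively.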

We can also understand the evolution of coefficients near the free-field 
fixed point using straightforward dimensional analysis. An operator with mass 
dimension $d_i$ has a coefficient with dimension $(\text{mass})^{d - d_i}$. The 
natural order of magnitude for this mass is the cutoff $\Lambda$. Thus, if 
$d_i < d$, the perturbation is increasingly important at low momenta. On the 
other hand, if $d_i > d$, the relative size of this term decreases as 
$(p/\Lambda)^{d_i - d}$ as the momentum $p \to 0$; thus the term is truly 
irrelevant. 

We have now shown that, at least in the vicinity of the zero-coupling 
fixed point, an arbitrarily complicated Lagrangian at the scale of the cutoff 
degenerates to a Lagrangian containing only a finite number of renormalizable 
interactions. It is instructive to compare this result with the conclusions 
of Chapter 10. There we took the philosophy that the cutoff $\Lambda$ should be 
disposed of by taking the limit $\Lambda \to \infty$ as quickly as possible. We 
found that this limit gives well-defined predictions only if the Lagrangian 
contains no parameters with negative mass dimension. From this viewpoint, it 
seemed exceedingly fortunate that QED, for example, contained no such 
parameters, since otherwise this theory would not yield well-defined predictions. 

Wilson’s analysis takes just the opposite point of view, that any quantum 
field theory is defined fundamentally with a cutoff $\Lambda$ that has some 
physical significance. In statistical mechanical applications, this momentum 
scale is the inverse atomic spacing. In QED and other quantum field theories 
appropriate to elementary particle physics, the cutoff would have to be 
associated with some fundamental graininess of spacetime, perhaps a result of 
quantum fluctuations in gravity. We discuss some speculations on the nature of 
this cutoff in the Epilogue. But whatever this scale is, it lies far beyond the 
reach of present-day experiments. The argument we have just given shows that this 
circumstance explains the renormalizability of QED and other quantum field 
theories of particle interactions. Whatever the Lagrangian of QED was at 
its fundamental scale, as long as its couplings are sufficiently weak, it must 
be described at the energies of our experiments by a renormalizable effective 
Lagrangian. 

Figure 12.1. Renormalization group flows near the free-field fixed point in 
scalar field theory: (a) $d > 4$; (b) $d = 4$. 

On the other hand, we should emphasize that these simple conclusions can 
be altered by sufficiently strong field theory interactions. Away from the 
free-field fixed point, the simple transformation laws (12.26) receive 
corrections proportional to higher powers of the coupling constants. If these 
corrections are large enough, they can halt or reverse the renormalization group 
flow. They could even create new fixed points, which would give new types of 
$\Lambda \to \infty$ limits. 

To illustrate the possible influences of interactions in a relatively simple 
context, let us discuss the renormalization group flows near $\mathcal{L}_0$ for 
the specific case of $\phi^4$ theory. It is instructive to consider the three 
cases $d > 4$, $d = 4$, and $d < 4$ in turn. When $d > 4$, the only relevant 
operator is the scalar field mass term. Then the renormalization group flows near 
$\mathcal{L}_0$ have the form shown in Fig. 12.1(a). The $\phi^4$ interaction and 
possible higher-order interactions die away, while the mass term increases in 
importance. 

In previous chapters, we have always discussed $\phi^4$ theory in the limit in 
which the mass is small compared to the cutoff. Let us take a moment to 
rewrite this condition in the language of renormalization group flows. In the 
course of the flow, the effective mass term $m'^2$ becomes large and eventually 
comes to equal the current cutoff. For example, near the free-field fixed point, 
after $n$ iterations, $m'^2 = m^2 b^{-2n}$, and eventually there is an $n$ such 
that $m'^2 \sim \Lambda^2$. At this point, we have integrated out the entire 
momentum region between the original $\Lambda$ and the effective mass of the 
scalar field. The mass term then suppresses the remaining quantum fluctuations. 
In general, the criterion that the scalar field mass is small compared to the 
cutoff is equivalent to the statement that $m'^2 \sim \Lambda^2$ only after a 
large number of iterations of the renormalization group transformation. 

This criterion is met whenever the initial conditions for the renormalization 
group flow are adjusted so that the trajectory passes very close to a fixed 
point. In principle, the flow could begin far away, along the direction of an 
irrelevant operator. The original value of $m^2$ need not be particularly small, 
as long as this original value is canceled by corrections arising from the 
diagrammatic contributions to $\mathcal{L}_{\rm eff}$. Thus we could imagine 
constructing a scalar field theory in $d > 4$ by writing a complicated nonlinear 
Lagrangian, but adjusting the original $m^2$ so the trajectory that begins at 
this Lagrangian eventually passes close to the free-field fixed point 
$\mathcal{L}_0$. In this case, the effective theory at momenta small compared to 
the cutoff should be extremely simple: It will be a free field theory with 
negligible nonlinear interaction. As will be discussed in the next chapter, this 
remarkable prediction has been verified in mathematical models of magnetic 
systems in more than four dimensions: Even though the original model is highly 
nonlinear, the correlation function of spins near the phase transition has the 
free-field form given by the higher-dimensional analogue of Eq. (12.2). 
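
The criterion used above, that $m'^2 \sim \Lambda^2$ is reached only after many 
iterations, can be made quantitative with the linearized law 
$m'^2 = m^2 b^{-2n}$. The sketch below is a rough illustration under that 
assumption alone (it ignores the diagrammatic corrections $\Delta m^2$, and the 
function name is ours); it counts the steps needed for the effective mass to 
reach the cutoff:

```python
import math


def iterations_to_reach_cutoff(m, cutoff, b):
    """Smallest n with m^2 * b^(-2n) >= cutoff^2, using only Eq. (12.26)."""
    if not (0.0 < b < 1.0):
        raise ValueError("b must satisfy 0 < b < 1")
    if not (0.0 < m <= cutoff):
        raise ValueError("need 0 < m <= cutoff")
    n = 0
    m_squared = m * m
    while m_squared < cutoff * cutoff:
        m_squared /= b * b   # one application of m'^2 = m^2 * b^(-2)
        n += 1
    return n


# Example: a mass a thousand times below the cutoff, halving the cutoff each step.
n_steps = iterations_to_reach_cutoff(1e-3, 1.0, 0.5)
print(n_steps, math.log(1.0 / 1e-3) / math.log(1.0 / 0.5))
```

The count grows like $\log(\Lambda/m)/\log(1/b)$, so a small ratio $m/\Lambda$ 
translates directly into a long stay near the fixed point.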

Next consider the case $d = 4$. For this case, Eq. (12.26) does not give 
enough information to tell us whether the $\phi^4$ interaction is important or 
unimportant at large distances. So we must go back to the complete transformation 
law (12.24). The leading contribution to $\Delta\lambda$ is given by Eq. (12.15). 
The leading contribution to $\Delta Z$ is of order $\lambda^2$ and can be 
neglected. (This is just what happened with the first correction to $\delta_Z$ in 
Section 10.2.) Thus we find the transformation 

$$\lambda' = \lambda - \frac{3\lambda^2}{16\pi^2}\, \log(1/b). \tag{12.28}$$

This says that $\lambda$ slowly decreases as we integrate out high-momentum 
degrees of freedom. 
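
To see how slow this decrease is, one can simply iterate (12.28). In the sketch 
below the function names are ours, and the comparison curve is our own 
continuum limit of the recursion: taking $b \to 1$ with $t = n \log(1/b)$ turns 
(12.28) into $d\lambda/dt = -3\lambda^2/16\pi^2$, whose solution is 
$\lambda(t) = \lambda_0 / (1 + 3\lambda_0 t / 16\pi^2)$.

```python
import math


def step_lambda_d4(lam, b):
    """One application of Eq. (12.28) in d = 4."""
    return lam - 3.0 * lam ** 2 / (16.0 * math.pi ** 2) * math.log(1.0 / b)


def flow_lambda_d4(lam0, b, n_steps):
    """Values of lambda after 0, 1, ..., n_steps momentum shells."""
    history = [lam0]
    for _ in range(n_steps):
        history.append(step_lambda_d4(history[-1], b))
    return history


def continuum_lambda_d4(lam0, t):
    """Solution of d(lam)/dt = -3 lam^2 / (16 pi^2), with t = log(cutoff ratio)."""
    return lam0 / (1.0 + 3.0 * lam0 * t / (16.0 * math.pi ** 2))


# Example: lambda_0 = 1, thin shells b = 0.99, one thousand steps.
b_shell, n_shells = 0.99, 1000
history = flow_lambda_d4(1.0, b_shell, n_shells)
t_total = n_shells * math.log(1.0 / b_shell)
print(history[-1], continuum_lambda_d4(1.0, t_total))
```

Even after lowering the cutoff by a factor of roughly $e^{10}$, the coupling in 
this example has dropped only by about 16 percent, which is the sense in which 
the $d = 4$ direction decays slowly.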

The diagram contributing to the correction $\Delta\lambda$ has the same structure 
as the one-loop diagrams computed in Section 10.2. In fact, these are 
essentially the same diagrams, and differ only in whether the integrals are 
carried out iteratively or all at once. However, whereas the diagrams in 
Section 10.2 had ultraviolet divergences, the corresponding diagram in Wilson’s 
approach is well defined and gives the coefficient of a simple evolution equation 
of the coupling constant. This transformation gives a first example of the 
reinterpretation of ultraviolet divergences that we will make in this chapter. 

The transformation law (12.28) implies that the renormalization group 
flows near $\mathcal{L}_0$ have the form shown in Fig. 12.1(b), with one slowly 
decaying direction. If we follow the flows far enough, the behavior should again 
be that of a free field. This picture has the puzzling implication that 
four-dimensional interacting $\phi^4$ theory does not exist in the limit in which 
the cutoff goes to infinity. We will discuss this result further, and explain why 
it nevertheless makes sense to use $\phi^4$ theory as a model field theory, in 
Section 12.3. 

Figure 12.2. Renormalization group flows near the free-field fixed point in 
scalar field theory: $d < 4$. 

Finally consider the case $d < 4$. Now $\lambda$ becomes a relevant parameter. 
The theory thus flows away from the free theory $\mathcal{L}_0$ as we integrate out 
degrees of freedom; at large distances, the $\phi^4$ interaction becomes increasingly 
important. However, when $\lambda$ becomes large, the nonlinear corrections such 
as that displayed in Eq. (12.28) must also be considered. If we include this 
specific effect in $d < 4$, we find the recursion formula 

$$\lambda' = \left[\lambda - \frac{3\lambda^2}{(4\pi)^{d/2}\,\Gamma(d/2)}\,\frac{b^{d-4}-1}{4-d}\,\Lambda^{d-4}\right] b^{d-4}. \tag{12.29}$$


This equation implies that there is a value of $\lambda$ at which the increase due 
to rescaling is compensated by the decrease caused by the nonlinear effect. 
At this value, $\lambda$ is unchanged when we integrate out degrees of freedom. The 
corresponding Lagrangian is a second fixed point of the renormalization group 
flow. In the limit $d \to 4$, the flow (12.29) tends to (12.28) and so the new fixed 
point merges with the free-field fixed point. For $d$ sufficiently close to 4, the 
new fixed point will share with $\mathcal{L}_0$ the property that the mass parameter $m^2$ is 
increased by the iteration. Then the mass operator will be a relevant operator 
near the new fixed point, so that the renormalization group flows will have 
the form shown in Fig. 12.2. 
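The balance between rescaling and the nonlinear term can be checked numerically. The Python sketch below implements the recursion (12.29) and solves $\lambda' = \lambda$ algebraically for the nonzero fixed point, which gives $\lambda_* = (4-d)\,b^{4-d}\,\Lambda^{4-d}\,(4\pi)^{d/2}\Gamma(d/2)/3$. The choices $b = 0.9$, $d = 3.9$, units with $\Lambda = 1$, and all function names are assumptions made for illustration only. Since (12.29) is a weak-coupling formula, the numbers are qualitative unless $4 - d$ is small.

```python
import math

def loop_coefficient(d):
    """The factor 3 / ((4 pi)^(d/2) Gamma(d/2)) appearing in Eq. (12.29)."""
    return 3.0 / ((4.0 * math.pi) ** (d / 2.0) * math.gamma(d / 2.0))

def recursion(lam, b, d, cutoff=1.0):
    """One step of Eq. (12.29) for d < 4."""
    scale = b ** (d - 4.0)  # rescaling factor; larger than 1 for b < 1 and d < 4
    loop = (loop_coefficient(d) * lam**2
            * (scale - 1.0) / (4.0 - d) * cutoff ** (d - 4.0))
    return (lam - loop) * scale

def fixed_point(b, d, cutoff=1.0):
    """Nonzero solution of lam' = lam in Eq. (12.29)."""
    return (4.0 - d) * (b * cutoff) ** (4.0 - d) / loop_coefficient(d)

# Assumed illustrative values: b = 0.9, d = 3.9, cutoff = 1.
lam_star = fixed_point(b=0.9, d=3.9)
print(lam_star, recursion(lam_star, b=0.9, d=3.9))
```

Starting from a small coupling and iterating `recursion` drives $\lambda$ up toward `lam_star`, while `fixed_point` shrinks to zero as $d \to 4$, in line with the merging of the two fixed points.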

In this example, the new fixed point of the renormalization group had 
a Lagrangian with couplings weak enough that the transformation equations 
could be computed in perturbation theory. In principle, one could also find 
fixed points whose Lagrangians are strongly coupled, so that the renormalization 
group transformations cannot be understood by Feynman diagram 
analysis. Many examples of such fixed points are known in exactly solvable 
model field theories in two dimensions.† However, up to the present, all of 
the examples of quantum field theories that are important for physical applications 
have been found to be controlled either by the free-field fixed point 
or by fixed points, like the one described in the previous paragraph, that approach 
the free-field fixed point in a specific limit. No one understands why 
this should be. This observation implies that Feynman diagram analysis has 
unexpected power in evaluating the physical consequences of quantum field 
theories. 

†We mention some of these examples, and discuss other nonperturbative approaches 
to quantum field theory, in the Epilogue. 

One more aspect of $\phi^4$ theory deserves comment. Since the mass term, 
$m^2\phi^2$, is a relevant operator, its coefficient diverges rapidly under the 
renormalization group flow. We have seen above that, in order to end up at the 
desired value of $m^2$ at low momentum, we must imagine that the value of $m^2$ 
in the original Lagrangian has been adjusted very delicately. This adjustment 
has a natural interpretation in a magnetic system as the need to sensitively 
adjust the temperature to be very close to the critical point. However, it seems 
quite artificial when applied to the quantum field theory of elementary particles, 
which purports to be a fundamental theory of Nature. This problem 
appears only for scalar fields, since for fermions the renormalization of the 
mass is proportional to the bare mass rather than being an arbitrary additive 
constant. Perhaps this is the reason why there seem to be no elementary 
scalar fields in Nature. We will return to this question in the Epilogue. 


12.2 The Callan-Symanzik Equation 

Wilson’s picture of renormalization, as a flow in the space of possible Lagrangians, 
is beautifully intuitive, and gives us a deep understanding of why 
Nature should be describable in terms of renormalizable quantum field theories. 
In addition, however, this idea can be applied to extract further quantitative 
predictions from these theories. In the remainder of this chapter we 
will develop a formalism for extracting these predictions. Specifically, we will 
see that Wilson’s picture leads to predictions for the form of the high- and 
low-momentum behavior of correlation functions. In the simplest cases, the 
correlation functions turn out to scale as powers of their external momenta, 
with power laws that do not appear at any fixed order of perturbation theory. 

It is possible to derive these predictions directly from Wilson’s procedure 
of integrating out slices in momentum space, as Wilson originally did. However, 
now that we understand the basic idea of renormalization group flows, 
it will be technically easier to work in the more familiar context of ordinary 
renormalized perturbation theory. The discussion of the previous section was 
physically motivated but technically complex. It involved awkward integrals 
over finite domains, and used the artificial parameter $b$, which must cancel 
out in any final results. Furthermore, we know from Section 7.5 that a cutoff 
regulator leads to even more trouble in QED, since it conflicts with the 
Ward identity. The discussion of the present section will be much more abstract 
and formal, but it will remove these technical problems. In this section 
and the next we will derive a flow equation for the coupling constant, similar 
to the one we derived in Section 12.1. To obtain the flows of the most general 
Lagrangians, we will need some additional tools, to be developed in Sections 
12.4 and 12.5. 

How can we hope to obtain information on renormalization group flows 
from the expressions for renormalized Green’s functions, in which the cutoff 
has already been taken to infinity? We must first realize that renormalized 
quantum field theories correspond to a restricted class of the full set of possible 
Lagrangians that we considered in the previous section. In Wilson’s language, 
a renormalized field theory with the cutoff taken arbitrarily large corresponds 
to a trajectory that takes an arbitrarily long time to evolve to a large value of 
the mass parameter. Such a trajectory must, then, pass arbitrarily close to a 
fixed point, which we will assume to be the weak-coupling fixed point. In the 
slow evolution past this fixed point, the irrelevant operators in the original 
Lagrangian die away, and we are left only with the relevant and marginal 
operators. The coefficients of these operators are in one-to-one correspondence 
with the parameters of the renormalizable field theory. Thus, in working with 
a renormalized field theory, we are throwing away information on the evolution 
of irrelevant perturbations, but keeping information on the flows of relevant 
and marginal perturbations. 

The flows of these parameters cannot be determined from the cutoff dependence, 
because, in this framework, the cutoff has already been sent to 
infinity. However, we have an alternative, though more abstract, tool at our 
disposal. The parameters of a renormalized field theory are determined by a 
set of renormalization conditions, which are applied at a certain momentum 
scale (called the renormalization scale). By looking at how the parameters of 
the theory depend on the renormalization scale, we can recover the information 
contained in the renormalization group flows of the previous section. 

We consider first the specific case of $\phi^4$ theory in four dimensions, where 
the coupling constant $\lambda$ is dimensionless and the corresponding operator is 
marginal. For simplicity, we will also assume that the mass term $m^2$ has been 
adjusted to zero, so that the theory sits just at its critical point. We will 
perform this analysis in Minkowski space, using spacelike reference momenta. 
However, the analysis would be essentially identical if carried out in Euclidean 
space. If we wish to consider renormalization group predictions at timelike 
momenta, we must consider the possibilities of new singularities which make 
the analysis more complicated. These include both physical thresholds and the 
Sudakov double logarithms discussed in Section 6.4. We postpone discussion 
of these complications until Chapters 17 and 18. 


Renormalization Conditions 

To define the theory properly, we must specify the renormalization conditions. 
In Chapter 10 we used a natural set of renormalization conditions (10.19) for 
$\phi^4$ theory, defined in terms of the physical mass $m$. However, in a theory where 
$m = 0$, these conditions cannot be used because they lead to singularities in 
the counterterms. (Consider, for example, the limit $m^2 \to 0$ of Eq. (10.24).) 
To avoid such singularities, we choose an arbitrary momentum scale $M$ and 
impose the renormalization conditions at a spacelike momentum $p$ with $p^2 = -M^2$: 

$$\begin{aligned}
\big(\text{1PI two-point diagram}\big) &= 0 \quad \text{at } p^2 = -M^2; \\
\frac{d}{dp^2}\big(\text{1PI two-point diagram}\big) &= 0 \quad \text{at } p^2 = -M^2; \\
\big(\text{amputated four-point diagram}\big) &= -i\lambda \quad \text{at } (p_1+p_2)^2 = (p_1+p_3)^2 = (p_1+p_4)^2 = -M^2.
\end{aligned} \tag{12.30}$$


The parameter M is called the renormalization scale. These conditions define 
the values of the two- and four-point Green’s functions at a certain point and, 
in the process, remove all ultraviolet divergences. Speaking loosely, we say 
that we are “defining the theory at the scale M”. 

These new renormalization conditions take some getting used to. The 
second condition, in particular, implies that the two-point Green’s function 
has a coefficient of 1 at the unphysical momentum $p^2 = -M^2$, rather than on 
shell (at $p^2 = 0$): 

$$\langle\Omega|\,\phi(p)\phi(-p)\,|\Omega\rangle = \frac{i}{p^2} \quad \text{at } p^2 = -M^2.$$

Here $\phi$ is the renormalized field, related to the bare field $\phi_0$ by a scale factor 
that we again call $Z$: 

$$\phi = Z^{-1/2}\phi_0. \tag{12.31}$$

This $Z$, however, is not the residue of the physical pole in the two-point 
Green’s function of bare fields, as it was in Chapters 7 and 10. Instead, we 
now have 

$$\langle\Omega|\,\phi_0(p)\phi_0(-p)\,|\Omega\rangle = \frac{iZ}{p^2} \quad \text{at } p^2 = -M^2.$$

The Feynman rules for renormalized perturbation theory are the same as in 
Chapter 10, with the same relation between $Z$ and the counterterm $\delta_Z$, 

$$\delta_Z = Z - 1.$$

Now, however, the counterterms $\delta_Z$ and $\delta_\lambda$ must be adjusted to maintain the 
new conditions (12.30). 

The first renormalization condition in (12.30) holds the physical mass of 
the scalar field fixed at zero. We saw in Chapter 10 that, in $\phi^4$ theory, the 
one-loop propagator correction is momentum-independent and is completely 
canceled by the mass renormalization counterterm. At two-loop order, however, 
the situation becomes more complicated, and the propagator corrections 
require both mass and field strength renormalizations. In more general scalar 
field theories, such as the Yukawa theory example considered at the end of 
Section 10.2, this complication arises already at one-loop order. Since the field 
strength renormalization counterterm will play an important role in the discussion 
below, it will be helpful to discuss briefly how we will treat this double 
subtraction. 

The evaluation of propagator corrections has some special simplifications 
for the case of a massless scalar field, which we consider here, and specifically 
with the use of dimensional regularization. Consider, for example, the one-loop 
propagator correction in Yukawa theory. In Section 10.2 we found an 
expression of the form 

$$\frac{\Gamma(1-\frac{d}{2})}{\Delta^{1-d/2}}, \tag{12.32}$$

where $\Delta$ is a linear combination of the fermion mass $m_f$ and $p^2$. If we compute 
the diagram using massless propagators only, $\Delta$ is proportional to $p^2$. Expression 
(12.32) has a pole at $d = 2$, corresponding to the quadratically divergent 
mass renormalization. However, the residue of this pole is independent of $p^2$, 
so we can completely cancel the pole with the mass counterterm $\delta_m$. This allows 
us to analytically continue (12.32) to $d = 4$. Then this expression takes 
the form 

$$\sim p^2\left(\frac{1}{2-d/2} + \log\frac{1}{-p^2} + C\right), \tag{12.33}$$

and gives no additional mass shift but only a field strength renormalization. 
The remaining divergence is canceled by the counterterm $\delta_Z$. If we adopt the 
rule that we should simply continue expressions of the form (12.32) to $d = 4$, 
we can forget about the counterterm $\delta_m$ altogether. 

In a regularization scheme with a momentum cutoff, the contributions to 
$\delta_m$ and $\delta_Z$ become tangled up with one another. Then it is more awkward to 
define the massless limit. In the following discussion, we will assume the use 
of dimensional regularization. However, to emphasize the physical role of the 
cutoff, we will write expressions of the form (12.33) as 

$$p^2\left(\log\frac{\Lambda^2}{-p^2} + C'\right). \tag{12.34}$$

The logarithmically divergent terms proportional to $p^2$ will agree with the 
divergences obtained with a momentum cutoff; the constant terms will not 
agree, but these will drop out of our final results. 

In $\phi^4$ theory, where the one-loop propagator correction is momentum-independent, 
the one-loop diagram is simply set to zero by this prescription. 
Then the preceding analysis applies to the two-loop and higher correction 
terms. 

The generalization of the analysis of this section to massive scalar field 
theory requires some additional formalism, which we postpone to Section 12.5. 



410 Chapter 12 The Renormalization Group 


The Callan-Symanzik Equation 


In the renormalization conditions (12.30), the renormalization scale $M$ is arbitrary. 
We could just as well have defined the same theory at a different 
scale $M'$. By “the same theory”, we mean a theory whose bare Green’s functions, 

$$\langle\Omega|\,T\phi_0(x_1)\phi_0(x_2)\cdots\phi_0(x_n)\,|\Omega\rangle,$$

are given by the same functions of the bare coupling constant $\lambda_0$ and the 
cutoff $\Lambda$. These functions make no reference to $M$. The dependence on $M$ 
enters only when we remove the cutoff dependence by rescaling the fields 
and eliminating $\lambda_0$ in favor of the renormalized coupling $\lambda$. The renormalized 
Green’s functions are numerically equal to the bare Green’s functions, up to 
a rescaling by powers of the field strength renormalization $Z$: 

$$\langle\Omega|\,T\phi(x_1)\phi(x_2)\cdots\phi(x_n)\,|\Omega\rangle = Z^{-n/2}\,\langle\Omega|\,T\phi_0(x_1)\phi_0(x_2)\cdots\phi_0(x_n)\,|\Omega\rangle. \tag{12.35}$$

The renormalized Green’s functions could be defined equally well at another 
scale $M'$, using a new renormalized coupling $\lambda'$ and a new rescaling factor $Z'$. 

Let us write more explicitly the effect of an infinitesimal shift of $M$. Let 
$G^{(n)}(x_1, \ldots, x_n)$ be the connected $n$-point function, computed in renormalized 
perturbation theory: 

$$G^{(n)}(x_1, \ldots, x_n) = \langle\Omega|\,T\phi(x_1)\cdots\phi(x_n)\,|\Omega\rangle_{\text{connected}}. \tag{12.36}$$

Now suppose that we shift $M$ by $\delta M$. There is a corresponding shift in the 
coupling constant and the field strength such that the bare Green’s functions 
remain fixed: 

$$\begin{aligned}
M &\to M + \delta M, \\
\lambda &\to \lambda + \delta\lambda, \\
\phi &\to (1 + \delta\eta)\,\phi.
\end{aligned} \tag{12.37}$$

Then the shift in any renormalized Green’s function is simply that induced 
by the field rescaling, 

$$G^{(n)} \to (1 + n\,\delta\eta)\,G^{(n)}.$$

If we think of $G^{(n)}$ as a function of $M$ and $\lambda$, we can write this transformation 
as 

$$dG^{(n)} = \frac{\partial G^{(n)}}{\partial M}\,\delta M + \frac{\partial G^{(n)}}{\partial\lambda}\,\delta\lambda = n\,\delta\eta\,G^{(n)}. \tag{12.38}$$

Rather than writing this relation in terms of $\delta\lambda$ and $\delta\eta$, it is conventional 
to define the dimensionless parameters 

$$\beta \equiv \frac{M}{\delta M}\,\delta\lambda; \qquad \gamma \equiv -\frac{M}{\delta M}\,\delta\eta. \tag{12.39}$$

Making these substitutions in Eq. (12.38) and multiplying through by $M/\delta M$, 
we obtain 

$$\left[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial\lambda} + n\gamma\right] G^{(n)}(x_1, \ldots, x_n; M, \lambda) = 0. \tag{12.40}$$

The parameters $\beta$ and $\gamma$ are the same for every $n$, and must be independent 
of the $x_i$. Since the Green’s function $G^{(n)}$ is renormalized, $\beta$ and $\gamma$ cannot 
depend on the cutoff, and hence, by dimensional analysis, these functions 
cannot depend on $M$. Therefore they are functions only of the dimensionless 
variable $\lambda$. We conclude that any Green’s function of massless $\phi^4$ theory must 
satisfy 

$$\left[M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + n\gamma(\lambda)\right] G^{(n)}(\{x_i\}; M, \lambda) = 0. \tag{12.41}$$

This relation is called the Callan-Symanzik equation.* It asserts that there exist 
two universal functions $\beta(\lambda)$ and $\gamma(\lambda)$, related to the shifts in the coupling 
constant and field strength, that compensate for the shift in the renormalization 
scale $M$. 

The preceding argument generalizes without difficulty to other massless 
theories with dimensionless couplings. In theories with multiple fields and couplings, 
there is a $\gamma$ term for each field and a $\beta$ term for each coupling. For 
example, we can define QED at zero electron mass by introducing a renormalization 
scale as in Eqs. (12.30). The renormalization conditions for the 
propagators are applied at $p^2 = -M^2$, and those for the vertex at a point 
where all three invariants are of order $-M^2$. Then the renormalized Green’s 
functions of this theory satisfy the Callan-Symanzik equation 

$$\left[M\frac{\partial}{\partial M} + \beta(e)\frac{\partial}{\partial e} + n\gamma_2(e) + m\gamma_3(e)\right] G^{(n,m)}(\{x_i\}; M, e) = 0, \tag{12.42}$$

where $n$ and $m$ are, respectively, the number of electron and photon fields in 
the Green’s function $G^{(n,m)}$ and $\gamma_2$ and $\gamma_3$ are the rescaling functions of the 
electron and photon fields. 

Computation of $\beta$ and $\gamma$ 

Before we work out the implications of the Callan-Symanzik equation, let us 
look more closely at the functions $\beta$ and $\gamma$ that appear in it. From their definitions 
(12.39), we see that they are proportional to the shift in the coupling 
constant and the shift in the field normalization, respectively, when the renormalization 
scale $M$ is increased. The behavior of the coupling constant as a 
function of $M$ is of particular interest, since it determines the strength of the 
interaction and the conditions under which perturbation theory is valid. We 
will see in the next section that the shift in the field strength is also reflected 
directly in the values of Green’s functions. 

The easiest way to compute the Callan-Symanzik functions is to begin 
with explicit perturbative expressions for some conveniently chosen Green’s 
functions. If we insist that these expressions satisfy the Callan-Symanzik equation, 
we will obtain equations that can be solved for $\beta$ and $\gamma$. Because the 
$M$ dependence of a renormalized Green’s function originates in the counterterms 
that cancel its logarithmic divergences, we will find that the $\beta$ and $\gamma$ 
functions are simply related to these counterterms, or equivalently, to the 
coefficients of the divergent logarithms. The precise formulae that relate $\beta$ and 
$\gamma$ to the counterterms will depend on the specific renormalization prescription 
and other details of the calculational scheme. At one-loop order, however, the 
expressions for $\beta$ and $\gamma$ are simple and unambiguous. 

*C. G. Callan, Phys. Rev. D2, 1541 (1970); K. Symanzik, Comm. Math. Phys. 
18, 227 (1970). 

As a first example, let us calculate the one-loop contributions to $\beta(\lambda)$ 
and $\gamma(\lambda)$ in massless $\phi^4$ theory. We can simplify the analysis by working in 
momentum space rather than coordinate space. Our strategy will be to apply 
the Callan-Symanzik equation to the diagrammatic expressions for the two- 
and four-point Green’s functions. 

The two-point function is given by 

[Diagrammatic expansion of $G^{(2)}(p)$: free propagator, loop corrections, and counterterms.] 

In massless $\phi^4$ theory, the one-loop propagator correction is completely canceled 
by the mass counterterm. Then the first nontrivial correction to the 
propagator comes from the two-loop diagram and its counterterm, and is of 
order $\lambda^2$. Meanwhile, the four-point function is given by 

[Diagrammatic expansion of $G^{(4)}$: tree-level vertex, one-loop diagrams, and vertex counterterm.] 

where we have omitted the canceled one-loop propagator corrections to the 
external legs. The diagrams of order $\lambda^3$ include nonvanishing two-loop propagator 
corrections to the external legs. 

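To calculate $\beta$, we apply the Callan-Symanzik equation to the four-point 
function: 

$$\left[M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + 4\gamma(\lambda)\right] G^{(4)}(p_1, \ldots, p_4) = 0. \tag{12.43}$$

Borrowing our result (10.21) from Section 10.2, we can write $G^{(4)}$ as 

$$G^{(4)} = \Big[-i\lambda + (-i\lambda)^2\big[iV(s) + iV(t) + iV(u)\big] - i\delta_\lambda\Big] \cdot \prod_{i=1,\ldots,4} \frac{i}{p_i^2},$$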

where $V(s)$ represents the loop integral in (10.20). Our renormalization 
condition (12.30) requires that the correction terms cancel at $s = t = u = -M^2$. 
The order-$\lambda^2$ vertex counterterm is therefore 

$$\delta_\lambda = (-i\lambda)^2 \cdot 3V(-M^2) = \frac{3\lambda^2}{2(4\pi)^{d/2}} \int_0^1 dx\, \frac{\Gamma(2-\frac{d}{2})}{\big(x(1-x)M^2\big)^{2-d/2}}. \tag{12.44}$$

The last expression follows from setting $m = 0$ and $p^2 = -M^2$ in Eq. (10.23) 
for $V(p^2)$. In the limit as $d \to 4$, Eq. (12.44) becomes 

$$\delta_\lambda = \frac{3\lambda^2}{2(4\pi)^2}\left[\frac{1}{2-d/2} - \log M^2 + \text{finite}\right], \tag{12.45}$$

where the finite terms are independent of $M$. This counterterm gives $G^{(4)}$ its 
$M$ dependence: 

$$M\frac{\partial}{\partial M} G^{(4)} = \frac{3i\lambda^2}{(4\pi)^2} \prod_i \frac{i}{p_i^2}.$$

Let us assume for the moment that $\gamma(\lambda)$ has no term of order $\lambda$; we will justify 
this in the next paragraph. Then the Callan-Symanzik equation (12.43) can 
be satisfied to order $\lambda^2$ only if the $\beta$ function of $\phi^4$ theory is given by 

$$\beta(\lambda) = \frac{3\lambda^2}{16\pi^2} + O(\lambda^3). \tag{12.46}$$
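The step from the counterterm (12.45) to the $\beta$ function can be checked numerically. The Python sketch below is our own illustration and not part of the derivation. It encodes the $M$-dependent part of $\delta_\lambda$ and evaluates $-M\,\partial\delta_\lambda/\partial M$ by a finite difference, which is what Eq. (12.43) requires of $\beta$ at order $\lambda^2$ when $\gamma$ has no term of order $\lambda$. The value of $\epsilon = 4 - d$, the finite constant, and the function names are arbitrary assumptions.

```python
import math

def delta_lambda(lam, M, eps=0.01, finite=0.0):
    """Vertex counterterm in the form of Eq. (12.45), with d = 4 - eps so 1/(2 - d/2) = 2/eps."""
    prefactor = 3.0 * lam**2 / (2.0 * (4.0 * math.pi) ** 2)
    return prefactor * (2.0 / eps - math.log(M**2) + finite)

def beta_from_counterterm(lam, M, h=1e-4, **kwargs):
    """Estimate beta = -M d(delta_lambda)/dM with a central difference in log M."""
    up = delta_lambda(lam, M * math.exp(h), **kwargs)
    down = delta_lambda(lam, M * math.exp(-h), **kwargs)
    return -(up - down) / (2.0 * h)

lam, M = 0.5, 3.0  # assumed sample values
print(beta_from_counterterm(lam, M), 3.0 * lam**2 / (16.0 * math.pi**2))
```

The pole term and the finite constant drop out of the derivative, so only the coefficient of $\log M^2$ matters; the two printed numbers should agree up to finite-difference rounding.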

Next, consider the Callan-Symanzik equation for the two-point function: 

$$\left[M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + 2\gamma(\lambda)\right] G^{(2)}(p) = 0. \tag{12.47}$$

Since, to one-loop order, there are no propagator corrections to $G^{(2)}$, no dependence 
on $M$ or $\lambda$ is introduced to order $\lambda$. Thus the $\gamma$ function is zero to 
this order: 

$$\gamma = 0 + O(\lambda^2). \tag{12.48}$$

This justifies the assumption made in the previous paragraph. The two-loop 
propagator correction is divergent, and its counterterm contains a term of 
order $\lambda^2$ which depends on $M$. This contributes to the first term in Eq. (12.47). 
Since $\beta$ is of order $\lambda^2$ and the corrections to $G^{(2)}$ are of order $\lambda^2$, the leading 
contributions to the second term in (12.47) are of order $\lambda^3$. Thus $\gamma$ acquires a 
nonzero contribution in order $\lambda^2$. This leading contribution to $\gamma$ is computed 
in Problem 13.2. 

The preceding example illustrates how $\beta$ and $\gamma$ can be calculated in more 
general theories with dimensionless couplings. In such theories, the $M$ dependence 
of Green’s functions enters through the field-strength and vertex counterterms, 
which are used to subtract the divergent logarithms. The lowest-order 
expressions for $\beta$ and $\gamma$ can be computed directly from these counterterms, 
or from the coefficients of the divergent logarithms. 

In any renormalizable massless scalar field theory, the two-point Green’s 
function has the generic form 

$$\begin{aligned}
G^{(2)}(p) &= (\text{free propagator}) + (\text{loop diagrams}) + (\text{counterterm}) + \cdots \\
&= \frac{i}{p^2} + \frac{i}{p^2}\left(A\log\frac{\Lambda^2}{-p^2} + \text{finite}\right) + \frac{i}{p^2}\,(ip^2\delta_Z)\,\frac{i}{p^2} + \cdots.
\end{aligned} \tag{12.49}$$

The $M$ dependence of this expression, to lowest order, comes entirely from 
the counterterm $\delta_Z$. Applying the Callan-Symanzik equation to $G^{(2)}(p)$, and 
neglecting the $\beta$ term (which is always smaller by at least one power of the 
coupling constant), we find 

$$-\frac{i}{p^2}\,M\frac{\partial}{\partial M}\delta_Z + 2\gamma\,\frac{i}{p^2} = 0,$$

or 

$$\gamma = \frac{1}{2}\,M\frac{\partial}{\partial M}\delta_Z \quad \text{(to lowest order)}. \tag{12.50}$$

To make this result more explicit, note that the counterterm must be 

$$\delta_Z = A\log\frac{\Lambda^2}{M^2} + \text{finite}$$

in order to cancel the divergent logarithm in $G^{(2)}$. Thus $\gamma$ is simply the coefficient 
of the logarithm: 

$$\gamma = -A \quad \text{(to lowest order)}. \tag{12.51}$$

In most theories (e.g., Yukawa theory or QED), the first logarithmic divergence 
in $\delta_Z$ occurs at the one-loop level. However, even in $\phi^4$ theory, formulae 
(12.50) and (12.51) are true for the first nonvanishing term in $\delta_Z$, in this case 
the two-loop contribution.* By replacing the scalar field propagator ($i/p^2$) 
with a fermion propagator ($i/\not{p}$), we could repeat this argument line for line 
to compute the $\gamma$ function for a fermion field in terms of its field strength 
counterterm $\delta_Z$. 

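We can derive similar expressions for the $\beta$ function of a generic dimensionless 
coupling constant $g$, associated with an $n$-point vertex. Taking propagator 
corrections into account, the full connected Green’s function, to one-loop 
order, has the general form 

$$\begin{aligned}
G^{(n)} &= \begin{pmatrix}\text{tree-level}\\ \text{diagram}\end{pmatrix} + \begin{pmatrix}\text{1PI loop}\\ \text{diagrams}\end{pmatrix} + \begin{pmatrix}\text{vertex}\\ \text{counterterm}\end{pmatrix} + \begin{pmatrix}\text{external leg}\\ \text{corrections}\end{pmatrix} \\
&= \Big(\prod_i \frac{i}{p_i^2}\Big)\Big[-ig - iB\log\frac{\Lambda^2}{-p^2} - i\delta_g + (-ig)\sum_i\Big(A_i\log\frac{\Lambda^2}{-p_i^2} - \delta_{Z_i}\Big)\Big] \\
&\quad + \text{finite terms}.
\end{aligned} \tag{12.52}$$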

In this expression, $p_i$ are the momenta on the external legs, and $p^2$ represents 
a typical invariant built from these momenta. We assume that renormalization 
conditions are applied at a point where all such invariants are spacelike 
and of order $-M^2$. The $M$ dependence of this expression comes from the 
counterterms $\delta_g$ and $\delta_{Z_i}$. Applying the Callan-Symanzik equation, we obtain 

$$M\frac{\partial}{\partial M}\Big(\delta_g - g\sum_i \delta_{Z_i}\Big) + \beta(g) + g\sum_i \frac{1}{2}\,M\frac{\partial}{\partial M}\delta_{Z_i} = 0,$$

or 

$$\beta(g) = M\frac{\partial}{\partial M}\Big(-\delta_g + \frac{1}{2}\,g\sum_i \delta_{Z_i}\Big) \quad \text{(to lowest order)}. \tag{12.53}$$

To be more explicit, we note that 

$$\delta_g = -B\log\frac{\Lambda^2}{M^2} + \text{finite}.$$

Thus the $\beta$ function is just a combination of the coefficients of the divergent 
logarithms: 

$$\beta(g) = -2B - g\sum_i A_i \quad \text{(to lowest order)}. \tag{12.54}$$

*At one loop, formula (12.33) implies that we can also identify $A$ as the coefficient 
of $2/(4-d)$ in the 1PI self-energy, in the limit $d \to 4$. This relation changes in higher 
loops. However, Eq. (12.50) remains correct. 

Notice that the finite parts of counterterms are independent of $M$ and 
therefore never contribute to $\beta$ or $\gamma$. This means that, to compute the leading 
terms in the Callan-Symanzik functions, we needn’t be too precise in specifying 
renormalization conditions: Any momentum scale of order $M^2$ will yield 
the same results. The divergent parts of the counterterms can be estimated 
simply by setting all invariants inside of logarithms equal to $M^2$, as we did 
above in our expression for the $n$-point Green’s function. 
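To see these statements at work, here is a small Python sketch of formulae (12.53) and (12.54). It builds counterterms of the stated logarithmic form, differentiates them numerically with respect to $\log M$, and compares with $-2B - g\sum_i A_i$. The numerical values of $g$, $B$, $A_i$, the cutoff, and the finite constants are made up for illustration, as are the function names.

```python
import math

def delta_g(B, cutoff, M, finite=0.0):
    """Vertex counterterm: -B log(Lambda^2 / M^2) + finite."""
    return -B * math.log(cutoff**2 / M**2) + finite

def delta_Z(A, cutoff, M, finite=0.0):
    """Field strength counterterm: A log(Lambda^2 / M^2) + finite."""
    return A * math.log(cutoff**2 / M**2) + finite

def beta_lowest_order(g, B, A_list, cutoff, M, finite=0.0, h=1e-4):
    """Eq. (12.53): M d/dM of (-delta_g + (g/2) sum_i delta_Zi), by central difference."""
    def combination(scale):
        legs = sum(delta_Z(A, cutoff, scale, finite) for A in A_list)
        return -delta_g(B, cutoff, scale, finite) + 0.5 * g * legs
    return (combination(M * math.exp(h)) - combination(M * math.exp(-h))) / (2.0 * h)

# Hypothetical inputs: a three-point coupling with three external legs.
g, B, A_list = 0.5, -0.01, [0.002, 0.003, 0.004]
numeric = beta_lowest_order(g, B, A_list, cutoff=1.0e3, M=2.0)
closed = -2.0 * B - g * sum(A_list)  # Eq. (12.54)
print(numeric, closed)
```

Changing `finite` or `cutoff` leaves the result unchanged, which is the numerical counterpart of the remark that finite parts of counterterms never contribute at this order.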

As in the computation of $\gamma$, this argument can be applied almost without 
change to coupling constants for fields with spin. In Yukawa theory, for example, 
we consider the three-point function with one incoming fermion, one 
outgoing fermion, and one scalar, with momenta $p_1$, $p_2$, and $p_3$, respectively. 
Then the tree-level expression for the three-point function is 

$$\frac{i}{\not{p}_1}\,\frac{i}{\not{p}_2}\,\frac{i}{p_3^2}\,(-ig). \tag{12.55}$$

The one-loop corrections replace the quantity $(-ig)$ by the expression in 
brackets in Eq. (12.52). Then formulae (12.53) and (12.54) hold also for the 
$\beta$ function of this theory. 

Similar expressions also apply in QED, though there are a number of 
small complications. The first comes in computing the $\gamma$ function for the 
photon propagator. In Eq. (7.74), we saw that the general form of the photon 
propagator in Feynman gauge is 

$$D^{\mu\nu}(q) = D(q)\left(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2}\right) + \frac{-i}{q^2}\left(\frac{q^\mu q^\nu}{q^2}\right). \tag{12.56}$$

The coefficient of the last term in (12.56) depends on the gauge. Fortunately, 
this term drops out of all gauge-invariant observables. Thus it makes sense 
to concentrate on the first term, projecting all external photons onto their 
transverse components. Projecting the photon propagator, we see that $D(q)$ 
satisfies the Callan-Symanzik equation. Since the corrections to this function 
have the form (12.49), the arguments following that formula are valid for 
photons as well as for electrons and scalars. Thus, to leading order, 

$$\gamma_2 = \frac{1}{2}\,M\frac{\partial}{\partial M}\delta_2, \qquad \gamma_3 = \frac{1}{2}\,M\frac{\partial}{\partial M}\delta_3, \tag{12.57}$$

where $\delta_2$ and $\delta_3$ are the counterterms defined in Section 10.3. 

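Similarly, we may consider the three-point connected Green’s function 
$\langle\psi(p_1)\bar\psi(p_2)A_\mu(q)\rangle$, projected onto transverse components of the photon. At 
leading order, this function equals 

$$\frac{i}{\not{p}_1}\,(-ie\gamma^\nu)\,\frac{i}{\not{p}_2}\,\frac{-i}{q^2}\left(g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right).$$

The divergent one-loop corrections have the same form, with $(-ie)$ replaced 
by logarithmically divergent terms. Thus, Eq. (12.53) gives the lowest-order 
expression for the $\beta$ function: 

$$\beta(e) = M\frac{\partial}{\partial M}\left(-e\delta_1 + e\delta_2 + \frac{e}{2}\,\delta_3\right). \tag{12.58}$$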


To find explicit expressions for the Callan-Symanzik functions of QED, 
we must write expressions for the counterterms $\delta_1$, $\delta_2$, $\delta_3$. In Section 10.3, 
we evaluated these counterterms using on-shell renormalization conditions 
with massive fermions. We must now re-evaluate these terms for massless 
fermions and renormalization at $-M^2$. Fortunately, we need only evaluate 
the logarithmically divergent pieces of these counterterms, which are identical 
in the two cases. Reading from Eqs. (10.43) and (10.44), we find 

$$\begin{aligned}
\delta_1 = \delta_2 &= -\frac{e^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} + \text{finite}, \\
\delta_3 &= -\frac{e^2}{(4\pi)^2}\,\frac{4}{3}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} + \text{finite}.
\end{aligned} \tag{12.59}$$

Using formulae (12.57) and (12.59), we obtain at leading order 

$$\gamma_2(e) = \frac{e^2}{16\pi^2}, \qquad \gamma_3(e) = \frac{e^2}{12\pi^2}. \tag{12.60}$$

And from Eq. (12.58), we find 

$$\beta(e) = \frac{e^3}{12\pi^2}. \tag{12.61}$$


It is important to remember that the expression we have used for $\delta_2$ 
explicitly assumes the use of Feynman gauge. In fact, $\gamma_2$ depends on the gauge 
parameter, and this makes sense, because Green’s functions of individual $\psi$ 
and $\bar\psi$ fields are not gauge invariant. On the other hand, the QED vacuum 
polarization, and therefore $\gamma_3$ and $\beta$, are gauge invariant. 
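The arithmetic leading from (12.57) through (12.61) can be reproduced with a few lines of Python. The sketch below keeps only the divergent pieces of the counterterms (12.59), takes $d = 4 - \epsilon$ with a small assumed $\epsilon$, and differentiates numerically with respect to $\log M$. The sample charge $e = 0.3$, the value of $\epsilon$, and the helper names are our own choices, so the output should agree with the closed forms only up to corrections of order $\epsilon$.

```python
import math

def pole_term(M, eps):
    """Gamma(2 - d/2) / (M^2)^(2 - d/2) evaluated at d = 4 - eps."""
    return math.gamma(eps / 2.0) / (M**2) ** (eps / 2.0)

def log_derivative(f, M, h=1e-4):
    """Central-difference estimate of M df/dM."""
    return (f(M * math.exp(h)) - f(M * math.exp(-h))) / (2.0 * h)

def qed_functions(e, M=1.0, eps=1e-4):
    """Leading-order gamma_2, gamma_3 and beta from Eqs. (12.57)-(12.59)."""
    prefactor = e**2 / (4.0 * math.pi) ** 2
    delta_1 = lambda scale: -prefactor * pole_term(scale, eps)
    delta_2 = lambda scale: -prefactor * pole_term(scale, eps)
    delta_3 = lambda scale: -prefactor * (4.0 / 3.0) * pole_term(scale, eps)
    gamma_2 = 0.5 * log_derivative(delta_2, M)
    gamma_3 = 0.5 * log_derivative(delta_3, M)
    beta = log_derivative(
        lambda s: -e * delta_1(s) + e * delta_2(s) + 0.5 * e * delta_3(s), M)
    return gamma_2, gamma_3, beta

e = 0.3  # assumed sample value of the coupling
gamma_2, gamma_3, beta = qed_functions(e)
print(gamma_2, e**2 / (16 * math.pi**2))
print(gamma_3, e**2 / (12 * math.pi**2))
print(beta, e**3 / (12 * math.pi**2))
```

Because $\delta_1 = \delta_2$, their contributions to $\beta(e)$ cancel and the result is $e\,\gamma_3$, which makes explicit why $\beta$ inherits the gauge invariance of the vacuum polarization.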


The Meaning of $\beta$ and $\gamma$ 

We can obtain a deeper insight into the nature of $\beta$ and $\gamma$ by expressing them 
in terms of the parameters of bare perturbation theory: $Z$, $\lambda_0$, and $\Lambda$ for the 
case of $\phi^4$ theory. 

First recall that the bare and renormalized field are related by 

$$\phi(p) = Z(M)^{-1/2}\,\phi_0(p). \tag{12.62}$$

This equation expresses the dependence of the field rescaling on $M$. If $M$ is 
increased by $\delta M$, the renormalized field is shifted by 

$$\delta\eta = \frac{Z(M + \delta M)^{-1/2}}{Z(M)^{-1/2}} - 1.$$

Hence our original definition (12.39) of $\gamma$ gives us immediately 

$$\gamma(\lambda) = \frac{1}{2}\,\frac{M}{Z}\,\frac{\partial}{\partial M} Z. \tag{12.63}$$

Since $\delta_Z = Z - 1$ (Eq. (10.17)), this formula is in agreement with (12.50) to 
leading order. Formula (12.63), however, is an exact relation. This expression 
clarifies the relation of $\gamma$ to the field strength rescaling. However, it obscures 
the fact that $\gamma$ is independent of the cutoff $\Lambda$. To understand this aspect of 
$\gamma$, we have to go back to the original definition of this function in terms of 
renormalized Green’s functions, whose cutoff independence follows from the 
renormalizability of the theory. 
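A short Python sketch can make the comparison between (12.63) and (12.50) concrete. As a toy assumption, we truncate $Z$ to $1 + A\log(\Lambda^2/M^2)$, that is, we keep only the lowest-order logarithm in $\delta_Z$; the values of $A$, $\Lambda$, and $M$ are arbitrary. With this truncated $Z$, formula (12.63) gives $-A/Z$, which differs from the leading-order result $-A$ only by terms of higher order in $A$.

```python
import math

def Z_of_M(A, cutoff, M):
    """Toy field strength factor: Z = 1 + delta_Z with delta_Z = A log(Lambda^2 / M^2)."""
    return 1.0 + A * math.log(cutoff**2 / M**2)

def gamma_exact(A, cutoff, M, h=1e-4):
    """Eq. (12.63): (1/2) (M / Z) dZ/dM, with the derivative taken numerically."""
    dZ = (Z_of_M(A, cutoff, M * math.exp(h))
          - Z_of_M(A, cutoff, M * math.exp(-h))) / (2.0 * h)
    return 0.5 * dZ / Z_of_M(A, cutoff, M)

def gamma_leading(A):
    """Eq. (12.51): gamma = -A to lowest order."""
    return -A

A, cutoff, M = 0.01, 100.0, 1.0  # assumed sample values
print(gamma_exact(A, cutoff, M), gamma_leading(A))
```

The residual cutoff dependence of `gamma_exact` here is an artifact of truncating $Z$; in the full theory it must cancel order by order, which is the content of the renormalizability statement above.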

Similarly, we can find an instructive expression for $\beta$ in terms of the 
parameters of bare perturbation theory. Our original definition of $\beta$ in 
Eq. (12.39) made use of a quantity $\delta\lambda$, defined to be the shift of the renormalized 
coupling $\lambda$ needed to preserve the values of the bare Green’s functions 
when the renormalization point is shifted infinitesimally. Since the bare 
Green’s functions depend on the bare coupling $\lambda_0$ and the cutoff, this definition 
can be rewritten as 

$$\beta(\lambda) = M\frac{\partial}{\partial M}\lambda\,\bigg|_{\lambda_0,\Lambda}. \tag{12.64}$$

Thus the $\beta$ function is the rate of change of the renormalized coupling at 
the scale $M$ corresponding to a fixed bare coupling. Recalling our analysis in 
Section 12.1, it is tempting to associate $\lambda(M)$ with the coupling constant $\lambda'$ 
obtained by integrating out degrees of freedom down to the scale $M$. With this 
correspondence, the $\beta$ function is just the rate of the renormalization group 
flow of the coupling constant $\lambda$. A positive sign for the $\beta$ function indicates 
a renormalized coupling that increases at large momenta and decreases at 
small momenta. We can see explicitly that this relation works for $\phi^4$ theory, 
to leading order in $\lambda$, by comparing Eqs. (12.28) and (12.46). We will justify 
this correspondence further in the following section. 
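The comparison of Eqs. (12.28) and (12.46) can also be carried out numerically. In the Python sketch below, one Wilson step with parameter $b$ is treated as lowering the scale by a factor $b$, so that $n$ steps change $\log M$ by $n\log b$. We then integrate $d\lambda/d\log M = \beta(\lambda)$ over the same interval with a standard fourth-order Runge-Kutta scheme. The identification of the two scales, the starting coupling, and the function names are assumptions made for this illustration.

```python
import math

def beta_phi4(lam):
    """Leading-order beta function of phi^4 theory, Eq. (12.46)."""
    return 3.0 * lam**2 / (16.0 * math.pi**2)

def wilson_steps(lam0, b, n_steps):
    """Iterate the transformation (12.28) n_steps times."""
    lam = lam0
    for _ in range(n_steps):
        lam -= 3.0 * lam**2 / (16.0 * math.pi**2) * math.log(1.0 / b)
    return lam

def run_coupling(lam0, log_ratio, n_steps=1000):
    """Integrate d(lam)/d(log M) = beta(lam) over log(M/M0) = log_ratio with RK4."""
    h = log_ratio / n_steps
    lam = lam0
    for _ in range(n_steps):
        k1 = beta_phi4(lam)
        k2 = beta_phi4(lam + 0.5 * h * k1)
        k3 = beta_phi4(lam + 0.5 * h * k2)
        k4 = beta_phi4(lam + h * k3)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam

n, b = 500, 0.99  # assumed: many small Wilson steps
from_wilson = wilson_steps(1.0, b, n)
from_beta = run_coupling(1.0, n * math.log(b))
print(from_wilson, from_beta)
```

For small steps the two numbers agree closely, and both are smaller than the starting coupling, as expected for a positive $\beta$ function followed toward low momenta.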

The equality of the exact formula (12.64) with the first-order formula 
(12.53) again follows from the counterterm definitions (10.17). As with (12.63), 
it is not obvious that this formula for $\beta(\lambda)$ is independent of $\Lambda$, but that fact 
again follows from renormalizability. Conversely, it is possible to prove the 
renormalizability of $\phi^4$ theory by demonstrating, order by order in perturbation 
theory, that expressions (12.63) and (12.64) are independent of $\Lambda$.† 

12.3 Evolution of Coupling Constants 

Now that we have discussed all of the ingredients of the Callan-Symanzik 
equation, let us investigate its implications. We begin by finding the explicit 
solution to the Callan-Symanzik equation for the simplest situation, the two-point 
Green’s function of a scalar field theory. This solution will clarify the 
physical implications of the equation. In particular, it will cement the relation 
suggested at the end of the previous section, which identifies the $\beta$ function 
with the rate of the renormalization group flow of the coupling constant. We 
will then use this relation to discuss the qualitative features of the renormalization 
group flow in renormalizable field theories. 

Solution of the Callan-Symanzik Equation 

We would like to solve the Callan-Symanzik equation for the two-point Green’s 
function, $G^{(2)}(p)$, in a theory with a single scalar field. Since $G^{(2)}(p)$ has 
dimensions of (mass)$^{-2}$, we can express its dependence on $p$ and $M$ as 

$$G^{(2)}(p) = \frac{i}{p^2}\,g(-p^2/M^2). \tag{12.65}$$

This equation allows us to trade the derivative with respect to $M$ for a derivative 
with respect to $p^2$. For the remainder of this chapter, we will use the variable 
$p$ to represent the magnitude of the spacelike momentum: $p = (-p^2)^{1/2}$. 
Then we can rewrite the Callan-Symanzik equation as 

$$\left[p\frac{\partial}{\partial p} - \beta(\lambda)\frac{\partial}{\partial\lambda} + 2 - 2\gamma(\lambda)\right] G^{(2)}(p) = 0. \tag{12.66}$$

In free field theory, $\beta$ and $\gamma$ vanish and we recover the trivial result 

$$G^{(2)}(p) = \frac{i}{p^2}. \tag{12.67}$$

In an interacting theory, $\beta$ and $\gamma$ are nonzero functions of
$\lambda$. However, it is still possible to write the explicit solution to the
Callan-Symanzik equation, using the method of characteristics. Equivalently
(for those not well versed in the theory of partial differential equations),
we will apply a lovely hydrodynamic-bacteriological analogy due to Sidney
Coleman.‡ Imagine a narrow pipe running in the $x$ direction, containing a
fluid whose velocity is $v(x)$, as shown in Fig. 12.3. The pipe is inhabited
by bacteria, whose density is $D(t,x)$ and whose rate of growth is $\rho(x)$.
Then the future behavior of the function $D(t,x)$ is governed by the
differential equation

$$ \Big[ \frac{\partial}{\partial t} + v(x)\frac{\partial}{\partial x} - \rho(x) \Big] D(t,x) = 0. \tag{12.68} $$

Figure 12.3. Coleman's bacteriological analogy to the Callan-Symanzik
equation. The pipe is inhabited by bacteria with a given initial density
$D_i(x)$. The growth rate (determined by the illumination) and flow velocity
are given functions of $x$. The problem is to determine the density $D(t,x)$
at all subsequent times.

†Callan has given a beautiful proof of the renormalizability of $\phi^4$
theory, based on proving that the Callan-Symanzik equation holds order by
order in $\lambda$, in his article in Methods in Field Theory, R. Balian and
J. Zinn-Justin, eds. (North Holland, Amsterdam, 1976).

‡Coleman (1985), chap. 3.

The second term allows for the fact that the bacteria are swept along with the 
fluid, so their present density here determines their future density not here, 
but some distance ahead. This equation is identical to Eq. (12.66), with the 
replacements 

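$$ \begin{aligned}
\log(p/M) &\leftrightarrow t, \\
\lambda &\leftrightarrow x, \\
-\beta(\lambda) &\leftrightarrow v(x), \\
2\gamma(\lambda) - 2 &\leftrightarrow \rho(x), \\
G^{(2)}(p,\lambda) &\leftrightarrow D(t,x).
\end{aligned} \tag{12.69} $$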

Now suppose we know the initial concentration of the bacteria: $D(t,x) =
D_i(x)$ at time $t = 0$. Then we can determine the concentration of bacteria
in a fluid element at the point $x$ at any later time by computing the history
of that fluid element and then integrating the rate of growth along that path.
Consider the fluid element that is at $x$ at the time $t$. We can find out
where it was at time zero by integrating its motion backward in time. The
position of this element at time $t = 0$ is given by $\bar x(t;x)$, which
satisfies the differential equation

$$ \frac{d}{dt'}\bar x(t';x) = -v(\bar x), \quad \text{with } \bar x(0;x) = x. \tag{12.70} $$

Then, immediately,

$$ \begin{aligned}
D(t,x) &= D_i(\bar x(t;x)) \cdot \exp\Big( \int_0^t dt'\, \rho(\bar x(t';x)) \Big) \\
&= D_i(\bar x(t;x)) \cdot \exp\Big( \int_{\bar x(t)}^{x} dx'\, \frac{\rho(x')}{v(x')} \Big).
\end{aligned} \tag{12.71} $$
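The characteristic construction of Eqs. (12.70) and (12.71) can be tried out numerically. The following is a minimal sketch, not part of the original derivation: it follows one fluid element backward in time with a classical fourth-order Runge-Kutta step, accumulates the growth integral along the way, and then estimates the residual of Eq. (12.68) by central differences. The function names and the particular profiles `initial`, `velocity`, and `growth_rate` are hypothetical choices made only for illustration.

```python
import math

def density(D_i, v, rho, t, x, steps=2000):
    """Evaluate D(t, x) from Eqs. (12.70)-(12.71) by following one characteristic.

    The pair (x_bar, growth) is advanced with classical RK4 in the backward
    time t', using d x_bar/dt' = -v(x_bar) and d growth/dt' = rho(x_bar).
    """
    h = t / steps
    x_bar, growth = x, 0.0
    for _ in range(steps):
        k1 = -v(x_bar)
        k2 = -v(x_bar + 0.5 * h * k1)
        k3 = -v(x_bar + 0.5 * h * k2)
        k4 = -v(x_bar + h * k3)
        # rho is sampled at the same RK4 stage points as the position update
        g1 = rho(x_bar)
        g2 = rho(x_bar + 0.5 * h * k1)
        g3 = rho(x_bar + 0.5 * h * k2)
        g4 = rho(x_bar + h * k3)
        x_bar += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        growth += h * (g1 + 2 * g2 + 2 * g3 + g4) / 6.0
    return D_i(x_bar) * math.exp(growth)

def pde_residual(D_i, v, rho, t, x, h=1e-4):
    """Central-difference estimate of dD/dt + v dD/dx - rho D, Eq. (12.68)."""
    dD_dt = (density(D_i, v, rho, t + h, x) - density(D_i, v, rho, t - h, x)) / (2 * h)
    dD_dx = (density(D_i, v, rho, t, x + h) - density(D_i, v, rho, t, x - h)) / (2 * h)
    return dD_dt + v(x) * dD_dx - rho(x) * density(D_i, v, rho, t, x)

# Hypothetical profiles, chosen only for illustration.
initial = lambda x: math.exp(-x * x)
velocity = lambda x: 1.0 + 0.5 * math.sin(x)
growth_rate = lambda x: 0.2 * math.cos(x)

if __name__ == "__main__":
    print("D(1.5, 0.4) =", density(initial, velocity, growth_rate, 1.5, 0.4))
    print("residual    =", pde_residual(initial, velocity, growth_rate, 1.5, 0.4))
```

For constant velocity $c$ and constant growth rate $r$ the sketch should reduce to $D_i(x - ct)\,e^{rt}$, which gives a quick sanity check on the sign conventions.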

Now bring this solution back to our field theory problem by replacing
each bacteriological parameter with its corresponding field theory parameter.
The time $t = 0$ corresponds to $-p^2 = M^2$, and the initial concentration
$D_i(x)$ becomes an unknown function $\hat{\mathcal G}(\lambda)$. Then

$$ G^{(2)}(p,\lambda) = \hat{\mathcal G}(\bar\lambda(p;\lambda)) \cdot \exp\Big( -\int_{p'=M}^{p'=p} d\log(p'/M) \cdot 2\big[1 - \gamma(\bar\lambda(p';\lambda))\big] \Big), \tag{12.72} $$

where $\bar\lambda(p;\lambda)$ solves

$$ \frac{d}{d\log(p/M)}\bar\lambda(p;\lambda) = \beta(\bar\lambda), \qquad \bar\lambda(M;\lambda) = \lambda. \tag{12.73} $$

This differential equation describes the flow of a modified coupling constant
$\bar\lambda(p;\lambda)$ as a function of momentum. The rate of this flow is
just the $\beta$ function. Thus, this flow is strongly reminiscent of the
dependence of the renormalized coupling on the renormalization scale given by
Eq. (12.64). We will refer to $\bar\lambda(p)$ as the running coupling
constant. Its equation (12.73) is often called the renormalization group
equation.

One can check directly that (12.72) solves the Callan-Symanzik equation
by using the identity

$$ \int_{\lambda}^{\bar\lambda} \frac{d\lambda'}{\beta(\lambda')} = \int_{p'=M}^{p'=p} d\log(p'/M), \tag{12.74} $$

from which it follows that

$$ \Big( p\frac{\partial}{\partial p} - \beta(\lambda)\frac{\partial}{\partial\lambda} \Big) \bar\lambda(p;\lambda) = 0. \tag{12.75} $$

A convenient way of writing the solution (12.72) is

$$ G^{(2)}(p,\lambda) = \frac{i}{p^2}\, \mathcal G(\bar\lambda(p;\lambda)) \cdot \exp\Big( 2\int_{p'=M}^{p'=p} d\log(p'/M)\, \gamma(\bar\lambda(p';\lambda)) \Big), \tag{12.76} $$

in which $\mathcal G(\bar\lambda)$ is a function that must be determined. This
function cannot be determined from the general principles of renormalization
theory. Instead, we must compute $G^{(2)}(p)$ as a perturbation series in
$\lambda$ and match terms to the expansion of (12.76) as a series in the same
parameter. For the two-point function in $\phi^4$ theory, this matching is
rather trivial: $\mathcal G(\bar\lambda) = 1 + \mathcal O(\bar\lambda^2)$.
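As a hedged numerical illustration of the claim that (12.76) solves (12.66), the sketch below builds the solution for the one-loop $\phi^4$ $\beta$ function, $\beta = 3\lambda^2/16\pi^2$, and checks the Callan-Symanzik operator by central differences. Several ingredients are assumptions made only for the example: the form $\gamma(\lambda) = c\lambda^2$ with an arbitrary coefficient `C_COEF`, the function `script_G`, and the omission of the overall factor of $i$ (which does not affect a linear equation). With these choices the $\gamma$ integral can be done in closed form by changing variables from $\log p'$ to $\lambda'$ with (12.74).

```python
import math

B_COEF = 3.0 / (16.0 * math.pi ** 2)  # beta(lam) = B_COEF * lam**2 (one loop)
C_COEF = 0.01                          # hypothetical: gamma(lam) = C_COEF * lam**2

def beta(lam):
    return B_COEF * lam * lam

def gamma(lam):
    return C_COEF * lam * lam

def running(p, lam, M=1.0):
    """Running coupling for the one-loop beta function (valid below the pole)."""
    return lam / (1.0 - B_COEF * lam * math.log(p / M))

def script_G(lam_bar):
    """Hypothetical matching function, of the form 1 + O(lam_bar**2)."""
    return 1.0 + 0.3 * lam_bar ** 2

def G2(p, lam, M=1.0):
    """Eq. (12.76) without the overall factor i.

    The exponent 2 * int d log(p'/M) gamma equals
    2 * int_lam^lam_bar dlam' gamma/beta = 2 * (C_COEF/B_COEF) * (lam_bar - lam).
    """
    lam_bar = running(p, lam, M)
    return script_G(lam_bar) * math.exp(2.0 * (C_COEF / B_COEF) * (lam_bar - lam)) / p ** 2

def cs_residual(p, lam, h=1e-5):
    """[p d/dp - beta d/dlam + 2 - 2 gamma] G2, divided by G2 itself."""
    dG_dp = (G2(p * (1 + h), lam) - G2(p * (1 - h), lam)) / (2 * p * h)
    dG_dl = (G2(p, lam + h) - G2(p, lam - h)) / (2 * h)
    value = p * dG_dp - beta(lam) * dG_dl + (2.0 - 2.0 * gamma(lam)) * G2(p, lam)
    return value / G2(p, lam)

if __name__ == "__main__":
    for p in (0.2, 1.0, 5.0):
        print(p, cs_residual(p, 0.5))
```

The residual should vanish up to finite-difference error for any smooth choice of `script_G`, which reflects the statement that this function is not fixed by the equation itself.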

The preceding analysis can be applied to any family of Green's functions
that are related by uniform rescaling of the momenta. Consider, for example,
the connected four-point function of $\phi^4$ theory evaluated at spacelike
momenta $p_i$ such that $p_i^2 = -P^2$, $p_i \cdot p_j = 0$, so that $s$, $t$,
and $u$ are of order $-P^2$. To leading order in perturbation theory, this
function is given by

$$ G^{(4)}(P) = \Big( \frac{i}{P^2} \Big)^4 (-i\lambda). \tag{12.77} $$

Using the fact that $G^{(4)}$ has dimensions of (mass)$^{-8}$, we can exchange
$M$ for $P$ in the Callan-Symanzik equation and write this equation as

$$ \Big[ P\frac{\partial}{\partial P} - \beta(\lambda)\frac{\partial}{\partial\lambda} + 8 - 4\gamma(\lambda) \Big] G^{(4)}(P;\lambda) = 0. \tag{12.78} $$

The solution to this equation is

$$ G^{(4)}(P;\lambda) = \Big( \frac{i}{P^2} \Big)^4 \mathcal G^{(4)}(\bar\lambda(P;\lambda)) \cdot \exp\Big( 4\int_{M}^{P} d\log(P'/M)\, \gamma(\bar\lambda(P';\lambda)) \Big). \tag{12.79} $$

This formula must agree with (12.77) to leading order in $\lambda$; this
matching requires that

$$ \mathcal G^{(4)}(\bar\lambda) = -i\bar\lambda + \mathcal O(\bar\lambda^2). \tag{12.80} $$

We can now see the physical implication of the Callan-Symanzik equation.
The ordinary Feynman perturbation series for a Green's function depends
both on the coupling constant $\lambda$ and on the dimensionless parameter
$\log(-p^2/M^2)$. The perturbation theory can be badly behaved even when
$\lambda$ is small if the ratio $p^2/M^2$ is large. The solutions (12.76) and
(12.79) reorganize this dependence into a function of the running coupling
constant and an exponential scale factor. We consider these two pieces in turn.

The first factor in Eqs. (12.76) and (12.79) is a function of the running
coupling constant, evaluated at the momentum scale $p$. If $p$ were of order
$M$, the renormalization scale, this function would essentially be the
ordinary perturbative evaluation of the Green's function. The results (12.76)
and (12.79) instruct us to make use of this same expression at the scale $p$,
but to replace $\lambda$ with a new coupling constant $\bar\lambda$ appropriate
to that scale. Thus, the running coupling constant $\bar\lambda(p)$ is
precisely the effective coupling constant of the renormalization group flow.
This interpretation is particularly clear in the solution (12.79) for
$G^{(4)}(P)$, since this function directly measures the strength of the
$\phi^4$ coupling constant.

The exponential factor in Eqs. (12.76) and (12.79) has an equally simple
interpretation: It is the accumulated field strength rescaling of the
correlation function from the reference point $M$ to the actual momentum $p$
at which the Green's function is evaluated. This factor receives a
multiplicative contribution from each intermediate scale between $M$ and $p$.
Each of these contributions is, appropriately, computed using the running
coupling constant at that particular scale.

As a check on these formal arguments, we can use the explicit form of the
$\beta$ function of $\phi^4$ theory found in Eq. (12.46) and the
renormalization group equation (12.73) to evaluate the running coupling
constant of $\phi^4$ theory. This running coupling constant satisfies the
differential equation

$$ \frac{d}{d\log(p/M)}\bar\lambda = \frac{3\bar\lambda^2}{16\pi^2}, \qquad \text{with } \bar\lambda(M;\lambda) = \lambda. \tag{12.81} $$

Integrating, we find

$$ \frac{1}{\lambda} - \frac{1}{\bar\lambda} = \frac{3}{16\pi^2}\log(p/M), $$

and thus,

$$ \bar\lambda(p) = \frac{\lambda}{1 - (3\lambda/16\pi^2)\log(p/M)}. \tag{12.82} $$


Many properties of the solution to the Callan-Symanzik equation are visible
in this relation. First, the expansion of this formula for $\bar\lambda$ to
order $\lambda^2$ agrees precisely with Eq. (12.28), the rate of the
renormalization group flow from Wilson's method. Second, this expression for
the running coupling constant goes to zero at a logarithmic rate as
$p \to 0$. This coincides with our expectation that a positive value for the
$\beta$ function should imply an effective coupling that becomes stronger at
large momenta and weaker at small momenta.
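The step from (12.81) to (12.82) can be cross-checked by integrating the renormalization group equation numerically. The sketch below is only an illustration: it uses a fixed-step fourth-order Runge-Kutta integrator in the variable $\log(p/M)$ and compares the result with the closed form. The function names and the sample values of $\lambda$ and $\log(p/M)$ are arbitrary choices.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient: beta(lam) = B0 * lam**2

def beta(lam):
    return B0 * lam * lam

def running_numeric(lam, log_ratio, steps=1000):
    """RK4 integration of Eq. (12.81) from log(p/M) = 0 to log_ratio."""
    h = log_ratio / steps
    lam_bar = lam
    for _ in range(steps):
        k1 = beta(lam_bar)
        k2 = beta(lam_bar + 0.5 * h * k1)
        k3 = beta(lam_bar + 0.5 * h * k2)
        k4 = beta(lam_bar + h * k3)
        lam_bar += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam_bar

def running_closed(lam, log_ratio):
    """Closed form, Eq. (12.82)."""
    return lam / (1.0 - B0 * lam * log_ratio)

if __name__ == "__main__":
    lam = 1.0
    for log_ratio in (-50.0, -5.0, 0.0, 5.0, 10.0):
        print(log_ratio, running_numeric(lam, log_ratio), running_closed(lam, log_ratio))
```

Negative values of `log_ratio` correspond to $p < M$, where the coupling decreases, and positive values to $p > M$, where it grows, in line with the sign of the $\beta$ function.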

If we expand the running coupling constant $\bar\lambda(p)$ in powers of
$\lambda$, we find that the successive powers of the coupling constant are
multiplied by powers of logarithms,

$$ \lambda^{n+1} (\log p/M)^n, $$

which become large and invalidate a simple perturbation expansion for $p$ much
greater or much less than $M$. We have seen this problem of large logarithms
arising several times in our diagram calculations, and we have remarked on it
specifically as a problem in the discussion following Eq. (11.81). We now see
that the renormalization group gives a partial solution to this problem. In
this example, and in many others that we will study, the Callan-Symanzik
equation tells us how to sum these large logarithms into the running coupling
constant and multiplicative rescalings. If the running coupling constant
becomes large, as happens in $\phi^4$ theory for $p \to \infty$, the
perturbation expansion will break down anyway, and we will need more advanced
methods. However, if the running coupling constant becomes small, as for
$\phi^4$ theory as $p \to 0$, we will have successfully organized the powers
of logarithms into a meaningful and controlled expression. The specific
problem posed at the end of Section 11.4 will be solved explicitly by this
method in Section 13.2.
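The contrast between the fixed-order expansion and the resummed running coupling can be made concrete with a few lines of code. In this sketch, which assumes only the one-loop formula (12.82), the truncated series is the geometric series in $(3\lambda/16\pi^2)\log(p/M)$ cut off at a chosen order. The function names and the sample numbers are illustrative.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient of the phi^4 beta function

def truncated_series(lam, log_ratio, order):
    """Sum of lam**(n+1) * (B0 * log_ratio)**n for n = 0, ..., order."""
    return sum(lam ** (n + 1) * (B0 * log_ratio) ** n for n in range(order + 1))

def resummed(lam, log_ratio):
    """Running coupling of Eq. (12.82), which sums the series to all orders."""
    return lam / (1.0 - B0 * lam * log_ratio)

if __name__ == "__main__":
    lam = 1.0
    for log_ratio in (5.0, -100.0):
        print("log(p/M) =", log_ratio, " resummed:", resummed(lam, log_ratio))
        for order in (1, 5, 10):
            print("   order", order, ":", truncated_series(lam, log_ratio, order))
```

For a modest logarithm the truncated sums settle quickly onto the resummed value. For $\log(p/M) = -100$ the expansion parameter exceeds one in magnitude, so the partial sums oscillate with growing size even though the resummed coupling is small and perfectly well behaved.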
An Application to QED


For a more concrete application of the Callan-Symanzik equation, we can look
again at the electromagnetic potential between static charges, $V(x)$, which
we studied in Section 7.5. At very short distances or at large momenta, we can
ignore the electron mass in the computation of QED corrections to this
potential. In this approximation, the potential should obey the
Callan-Symanzik equation of massless QED. We could write this equation either
for $V(x)$ itself or for its Fourier transform; we choose to work in Fourier
space in order to make contact more easily with the results of Section 7.5.

We define the massless limit of QED by specifying a renormalization
scale $M$ at which the renormalized coupling $e_r$ is defined. If $M$ is taken
close to the electron mass $m$, at the point where the massless approximation
is just becoming valid, then the value of $e_r$ will be close to the physical
electron charge $e$. The potential between static charges is a measurable
energy, so its normalization is unambiguous and is not shifted from one
renormalization point to another. Thus the Callan-Symanzik equation for the
Fourier transform of the potential has no $\gamma$ term, being simply

$$ \Big[ M\frac{\partial}{\partial M} + \beta(e_r)\frac{\partial}{\partial e_r} \Big] V(q; M, e_r) = 0. \tag{12.83} $$

The Fourier transform of the potential has dimensions of (mass)$^{-2}$, so we
can trade dependence on $M$ for dependence on $q$ as in the scalar field
theory discussion above. This gives

$$ \Big[ q\frac{\partial}{\partial q} - \beta(e_r)\frac{\partial}{\partial e_r} + 2 \Big] V(q; M, e_r) = 0. \tag{12.84} $$

Equation (12.84) is almost the same as Eq. (12.66), so we can immediately
write down the solution as a special case of (12.76):

$$ V(q, e_r) = \frac{1}{q^2}\, \mathcal V(\bar e(q; e_r)), \tag{12.85} $$

where $\bar e(q)$ is the solution of the renormalization group equation

$$ \frac{d}{d\log(q/M)}\bar e(q; e_r) = \beta(\bar e), \qquad \bar e(M; e_r) = e_r. \tag{12.86} $$

By comparing this formula for $V(q)$ to the leading-order result

$$ V(q) \approx \frac{e^2}{q^2}, $$

we can identify $\mathcal V(\bar e) = \bar e^2 + \mathcal O(\bar e^4)$. Then

$$ V(q, e_r) = \frac{\bar e^2(q; e_r)}{q^2}, \tag{12.87} $$

up to corrections that are suppressed by powers of $e_r^2$ and contain no
compensatory large logarithms of $q/M$.

To turn Eq. (12.87) into a completely explicit formula, we need only
solve the renormalization group equation (12.86). Using the QED $\beta$
function (12.61), we can integrate (12.86) to find

$$ \frac{12\pi^2}{2} \Big( \frac{1}{e_r^2} - \frac{1}{\bar e^2} \Big) = \log(q/M). $$

This simplifies to

$$ \bar e^2(q) = \frac{e_r^2}{1 - (e_r^2/6\pi^2)\log(q/M)}. \tag{12.88} $$


This result is almost identical to the formula for the effective electric
charge that we found in Eq. (7.96). To cement the identification, set $M$ to
be of order the electron mass, $M^2 = A m^2$, and approximate $e_r$ at this
point by $e$, with $\alpha = e^2/4\pi$. Then Eq. (12.88) takes the form

$$ \bar\alpha(q) = \frac{\alpha}{1 - (\alpha/3\pi)\log(-q^2/A m^2)}. \tag{12.89} $$

The particular choice $A = \exp(5/3)$ reproduces Eq. (7.96). Of course, we
could not find this exact correspondence without the detailed one-loop
calculation of Section 7.5. Nevertheless, our present analysis produces the
correct asymptotic formula for the effective charge. Furthermore, our present
formalism can be applied to any renormalizable quantum field theory; it does
not rely on the special symmetries of QED that we exploited in Section 7.5.
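To get a feeling for the size of the effect, Eq. (12.89) can be evaluated directly. The sketch below is a rough illustration under stated assumptions: the numerical values of the low-energy fine-structure constant and of the electron mass are standard figures that do not appear in the text above, the momentum is passed as $Q = (-q^2)^{1/2}$ in MeV, and only the electron loop is included, so the output is not a prediction for the physical coupling at high energies. The formula is meant for $Q$ well above the electron mass, where the massless approximation applies.

```python
import math

ALPHA = 1.0 / 137.036    # assumed low-energy fine-structure constant
M_E = 0.510999           # assumed electron mass in MeV
A = math.exp(5.0 / 3.0)  # the choice that reproduces Eq. (7.96)

def alpha_running(Q):
    """Eq. (12.89) with Q = (-q^2)^(1/2) in MeV; electron loop only."""
    log_term = math.log(Q * Q / (A * M_E * M_E))
    return ALPHA / (1.0 - (ALPHA / (3.0 * math.pi)) * log_term)

if __name__ == "__main__":
    for Q in (10.0, 1.0e3, 1.0e5):
        print("Q = %.0f MeV   1/alpha = %.3f" % (Q, 1.0 / alpha_running(Q)))
```

The inverse coupling decreases only logarithmically, by a few units between the MeV scale and $10^5$ MeV, which is why the running is a small but measurable effect.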


Alternatives for the Running of Coupling Constants 


Now that we have computed the behavior of the running coupling constant in
two specific quantum field theories, let us consider more generally what
behaviors of the running coupling constant are possible in principle. We
continue to restrict our discussion to renormalizable theories in the massless
limit, with a single dimensionless coupling constant $\lambda$.

By the arguments of the previous section, the Green's functions in any
such theory obey a Callan-Symanzik equation. The solution of this equation
depends on a running coupling constant, $\bar\lambda(p)$, which satisfies a
differential equation

$$ \frac{d}{d\log(p/M)}\bar\lambda = \beta(\bar\lambda), \tag{12.90} $$


in which the function $\beta(\lambda)$ is computable as a power series in the
coupling constant. In the examples we have just discussed, the leading
coefficient in this power series was positive. However, as a matter of
principle, three behaviors are possible in the region of small $\lambda$:

(1) $\beta(\lambda) > 0$;

(2) $\beta(\lambda) = 0$;

(3) $\beta(\lambda) < 0$.

Examples of quantum fields are known that exhibit each of these behaviors.

We have already seen how, in theories of the first class, the running
coupling constant goes to zero in the infrared, leading to definite
predictions about the small-momentum behavior of the theory. However, the
running coupling constant becomes large in the region of high momenta. Thus
the short-distance behavior of the theory cannot be computed using Feynman
diagram perturbation theory. In fact, in the examples studied above, the
coupling constant formally goes to infinity at a large but finite value of the
momentum; thus it is not even clear that these theories possess a nontrivial
limit $\Lambda \to \infty$. A Feynman diagram analysis is useful in such
theories if one is mainly interested in large-distance or macroscopic
behavior. In Chapter 13 we will use this observation to solve problems in the
statistical mechanics of systems with critical points.

In theories of the second class, the coupling constant does not flow. In
these theories, the running coupling constant is independent of the momentum
scale, and thus equal to the bare coupling. This means that there can be
no ultraviolet divergences in the relation of coupling constants. The only
possible ultraviolet divergences in such theories are those associated with
field rescaling, which automatically cancel in the computation of $S$-matrix
elements. Such theories are called finite quantum field theories. Before the
emergence of our modern understanding of renormalization, these theories would
have been embraced as the solution to the problem of ultraviolet infinities.
But in fact the known finite field theories in four dimensions are very
special constructions (the so-called gauge theories with extended
supersymmetry) with no known physical application.

In theories of the third class, the running coupling constant becomes large
in the large-distance regime and becomes small at large momenta or short
distances. Imagine, for instance, that the sign of the QED $\beta$ function
were reversed:

$$ \beta(e) = -\tfrac{1}{2} C e^3. \tag{12.91} $$

Then, following our earlier analysis, we would have

$$ \bar e^2(p) = \frac{e^2}{1 + C e^2 \log(p/M)}. \tag{12.92} $$

This coupling constant tends to zero at a logarithmic rate as the momentum
scale increases. Such theories are called asymptotically free. In theories of
this class, the short-distance behavior is completely solvable by Feynman
diagram methods. Though ultraviolet divergences appear in every order of
perturbation theory, the renormalization group tells us that the sum of these
divergences is completely harmless. If we interpret these theories in terms of
a bare coupling $e_b$ and a finite cutoff $\Lambda$, the result (12.92)
indicates that there is a smooth limit in which $e_b$ tends to zero as
$\Lambda$ tends to infinity. Thus, asymptotically free theories give another,
more sophisticated, resolution of the problem of ultraviolet divergences. In
Chapter 17, we will see that asymptotic freedom plays an essential role in the
formulation of a field theory that describes the strong interactions of
elementary particle physics.

Now that we have enumerated the possibilities for the renormalization
group flow in the region of weak coupling, let us turn our attention to the
region of strong coupling. Here we will not be able to compute the $\beta$
function quantitatively, but we can at least use the renormalization group
equation to discuss qualitatively the possibilities for the coupling constant
flow. All of our explicit solutions for running coupling constants, Eqs.
(12.82), (12.88), and (12.92), predict that the running coupling becomes
infinite at a finite value of the momentum $p$. For example, according to
Eq. (12.82), the running coupling constant of $\phi^4$ theory should diverge
at

$$ p \sim M \exp\Big( \frac{16\pi^2}{3\lambda} \Big). \tag{12.93} $$

It is possible that this is the true behavior of the quantum field theory, but
we have not proved this, because when the running coupling constant becomes
large, the approximation we have made, ignoring the higher-order terms in the
$\beta$ function, is no longer valid. It is a logical possibility that the
higher terms of the $\beta$ function are negative, so that the $\beta$
function has the form shown in Fig. 12.4(a). In this case the $\beta$ function
has a zero at a nonzero value $\lambda_*$. When $\bar\lambda$ approaches this
value, the renormalization group flow slows to a halt; thus
$\lambda = \lambda_*$ would be a nontrivial fixed point of the renormalization
group. In this model, the running coupling constant $\bar\lambda$ tends to
$\lambda_*$ in the limit of large momentum.
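The location of the apparent divergence follows from asking where the denominators of the one-loop formulas vanish. The short sketch below does this for (12.82), reproducing the exponent of (12.93), and for the sign-reversed example (12.92). It works with $\log(p/M)$ rather than $p$ itself, because the exponentials are astronomically large or small at weak coupling. The function names and the value of the constant $C$ used in the demonstration are hypothetical.

```python
import math

def phi4_pole_log(lam):
    """log(p/M) at which the denominator of Eq. (12.82) vanishes; cf. Eq. (12.93)."""
    return 16.0 * math.pi ** 2 / (3.0 * lam)

def af_pole_log(e2, C):
    """log(p/M) at which the denominator of Eq. (12.92) vanishes (an infrared scale)."""
    return -1.0 / (C * e2)

def phi4_running(lam, log_ratio):
    """Eq. (12.82)."""
    return lam / (1.0 - 3.0 * lam * log_ratio / (16.0 * math.pi ** 2))

def af_running(e2, C, log_ratio):
    """Eq. (12.92), returning the running e^2."""
    return e2 / (1.0 + C * e2 * log_ratio)

if __name__ == "__main__":
    print("phi^4 pole at log(p/M) =", phi4_pole_log(1.0))
    print("coupling just below it :", phi4_running(1.0, 50.0))
    C = 0.1  # hypothetical coefficient in Eq. (12.91)
    print("asymptotically free pole at log(p/M) =", af_pole_log(1.0, C))
    print("coupling far in the ultraviolet      :", af_running(1.0, C, 100.0))
```

As the text stresses, these poles are a statement about the truncated $\beta$ function only; near them the coupling is large and the one-loop formulas can no longer be trusted.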

For the specific case of $\phi^4$ theory in four dimensions, we have strong
evidence from numerical studies that there is no such nontrivial fixed point.
However, we will soon demonstrate that there is a nontrivial fixed point in
$\phi^4$ theory in $d < 4$, and many more examples are known. It is thus
worthwhile to explore the implications of a fixed point in the
renormalization group flow.

For a $\beta$ function of the form of Fig. 12.4(a), the $\beta$ function
behaves in the vicinity of the fixed point as

$$ \beta(\lambda) \approx -B(\lambda - \lambda_*), \tag{12.94} $$

where $B$ is a positive constant. For $\bar\lambda$ near $\lambda_*$,

$$ \frac{d}{d\log p}\bar\lambda \approx -B(\bar\lambda - \lambda_*). \tag{12.95} $$

The solution of this equation is

$$ \bar\lambda(p) = \lambda_* + C \Big( \frac{M}{p} \Big)^{B}. \tag{12.96} $$

Thus, $\bar\lambda$ indeed tends to $\lambda_*$ as $p \to \infty$, and the
rate of approach is governed by the slope of the $\beta$ function at the
fixed point.
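The power-law approach in (12.96) can be seen in a toy model. The sketch below assumes a purely hypothetical $\beta$ function, $\beta(\lambda) = a\lambda^2(1 - \lambda/\lambda_*)$, which is positive at weak coupling and has a zero at $\lambda_*$ with slope $-B = -a\lambda_*$. It integrates the flow with a Runge-Kutta step and compares the distance from the fixed point at two scales that differ by one unit of $\log(p/M)$. According to (12.96), the ratio of these distances should approach $e^{-B}$.

```python
import math

A_COEF = 1.0    # hypothetical leading coefficient of the toy beta function
LAM_STAR = 1.0  # hypothetical fixed-point coupling

def beta(lam):
    """Toy beta function: positive at weak coupling, zero at lam = LAM_STAR."""
    return A_COEF * lam * lam * (1.0 - lam / LAM_STAR)

# Slope at the fixed point: beta'(LAM_STAR) = -A_COEF * LAM_STAR = -B_SLOPE
B_SLOPE = A_COEF * LAM_STAR

def flow(lam0, log_ratio, steps_per_unit=200):
    """RK4 integration of d(lam_bar)/d log(p/M) = beta(lam_bar) from lam0."""
    steps = max(1, int(round(abs(log_ratio) * steps_per_unit)))
    h = log_ratio / steps
    lam_bar = lam0
    for _ in range(steps):
        k1 = beta(lam_bar)
        k2 = beta(lam_bar + 0.5 * h * k1)
        k3 = beta(lam_bar + 0.5 * h * k2)
        k4 = beta(lam_bar + h * k3)
        lam_bar += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam_bar

if __name__ == "__main__":
    for t in (2.0, 6.0, 10.0):
        near = LAM_STAR - flow(0.5, t)
        far = LAM_STAR - flow(0.5, t + 1.0)
        print("log(p/M) = %4.1f   ratio = %.5f   exp(-B) = %.5f"
              % (t, far / near, math.exp(-B_SLOPE)))
```

In this toy flow the coupling approaches the fixed point from below, so the constant $C$ of (12.96) is negative. Only the exponent $B$ is tied to the slope of the $\beta$ function.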

This behavior has a dramatic consequence for the exact solution (12.72)
of the Callan-Symanzik equation for $G(p)$. For $p$ sufficiently large, the
integral in the exponential factor in this equation will be dominated by
values of $p$ for which $\bar\lambda(p)$ is close to $\lambda_*$. Then

$$ G(p) \approx \hat{\mathcal G}(\lambda_*) \exp\Big[ -\Big( \log\frac{p}{M} \Big) \cdot 2\big(1 - \gamma(\lambda_*)\big) \Big] \propto \Big( \frac{1}{p^2} \Big)^{1 - \gamma(\lambda_*)}. \tag{12.97} $$

Figure 12.4. Possible forms of the $\beta$ function with nontrivial zeros:
(a) ultraviolet-stable fixed point; (b) infrared-stable fixed point.

Thus the two-point correlation function returns to the form of a simple
scaling law, but with a power law different from that expected by dimensional
analysis. At the fixed point we have a scale-invariant quantum field theory in
which the interactions of the theory affect the law of rescaling. The shift of
the exponent $\gamma(\lambda_*)$ is called the anomalous dimension of the
scalar field. By convention, the function $\gamma(\lambda)$ is often called
the anomalous dimension even if there is no fixed point in the theory.

A similar behavior is possible in an asymptotically free theory. If the
$\beta$ function has the form shown in Fig. 12.4(b), the running coupling
constant will tend to a fixed point $\lambda_*$ as $p \to 0$. The two-point
correlation function of fields $G(p)$ will tend to a power law as in (12.97)
for asymptotically small momenta. The two cases shown in Figs. 12.4(a) and (b)
are called, respectively, ultraviolet-stable and infrared-stable fixed points.

In the previous section, we saw that the leading-order expressions for
the Callan-Symanzik functions $\beta$ and $\gamma$ are related in a simple way
to the ultraviolet divergent parts of the one-loop counterterms. However, we
noted that, in higher orders of perturbation theory, $\beta$ and $\gamma$
depend on the specific renormalization conventions used to define the Green's
functions. Still, there are some properties of these functions that are
independent of any convention. The coefficient of the logarithm in the
denominator of such expressions as (12.82) or (12.89) can be determined
unambiguously from experiments that measure this coupling constant. This
confirms the convention independence of the first $\beta$-function
coefficient. Experiments sensitive to the coupling constant can also determine
the existence of a zero of the $\beta$ function at strong coupling, and the
rate of approach to this asymptote. Thus the existence of a zero of the
$\beta$ function (but not necessarily the value of $\lambda_*$), the slope $B$
at the zero, and the value of the anomalous dimension at the fixed point
should all be independent of the conventions used to compute $\beta$ and
$\gamma$.


12.4 Renormalization of Local Operators 


The analysis of the previous two sections has been restricted to quantum field
theories with only dimensionless coefficients, that is, strictly renormalizable
field theories in the massless limit. It is not difficult to generalize this
formalism to theories with mass terms and other operators whose coefficients
have mass dimension. However, it is worthwhile to first devote some attention
to an intermediate step, by analyzing the renormalization group properties
of matrix elements of local operators. This is an interesting problem in its
own right, and we will devote considerable space to the applications of this
formalism in Chapter 18.

Matrix elements of local operators appear often in quantum field theory
calculations. Typically one considers a set of interacting particles that couple
weakly to an additional particle, which mediates new forces. Consider, for
example, the theory of strongly interacting quarks perturbed by the effects of
weak decay processes. The weak interaction is mediated by a massive vector
boson, the $W$. Let us write the interaction of the quarks with the $W$ very
schematically as

$$ \delta\mathcal L = \frac{g}{\sqrt 2}\, W_\mu\, \bar\psi\gamma^\mu(1 - \gamma^5)\psi, \tag{12.98} $$

and assign the $W$ boson the propagator

$$ \frac{-i g^{\mu\nu}}{q^2 - m_W^2 + i\epsilon}. \tag{12.99} $$

(We will discuss this interaction more correctly in Section 18.2 and in
Chapter 20.) Exchange of a $W$ boson leads to the interaction shown in
Fig. 12.5. For momentum transfers small compared to $m_W$, we can ignore the
$q^2$ in the $W$ propagator and write this interaction as the matrix element
of the operator

$$ \frac{g^2}{2 m_W^2}\, \mathcal O(x), \quad \text{where } \mathcal O(x) = \bar\psi\gamma^\mu(1 - \gamma^5)\psi\, \bar\psi\gamma_\mu(1 - \gamma^5)\psi. \tag{12.100} $$

In the spirit of Wilson's renormalization group procedure, we can say that, on
distance scales larger than $m_W^{-1}$, the $W$ boson can be integrated out,
leaving over the interaction (12.100).

How would we analyze the effects of the operator (12.100) on strongly
interacting particles composed of quarks and antiquarks? A useful way to
begin is to compute the Green's function of the operator $\mathcal O$ together
with fields that create and destroy quarks. If we approximate the theory of
quarks by a theory of free fermions, it is easy to compute these Green's
functions; for example:

$$ \big\langle \psi(p_1)\bar\psi(-p_2)\psi(p_3)\bar\psi(-p_4)\,\mathcal O(0) \big\rangle = S_F(p_1)\gamma^\mu(1 - \gamma^5)S_F(p_2)\; S_F(p_3)\gamma_\mu(1 - \gamma^5)S_F(p_4). \tag{12.101} $$

Figure 12.5. Interaction of quarks generated by the exchange of a $W$ boson.


However, in an interacting field theory, the answer will be much more
complicated. Some of these complications will involve the low-energy
interactions of quarks, and we will leave them outside of the present
discussion. However, in a renormalizable theory of quark interactions, one
will also find that Green's functions containing $\mathcal O$ have new
ultraviolet divergences. The one-loop corrections to (12.101) will contain
diagrams that evaluate to the right-hand side of (12.101) times a divergent
integral. These diagrams can be interpreted as field strength renormalizations
of the operator $\mathcal O$. As with correlation functions of elementary
fields, we can obtain finite and well-defined matrix elements of local
operators only if we establish conventions for the normalization of local
operators and introduce operator rescalings in the form of counterterms,
order by order in perturbation theory, to preserve these conventions. More
specifically, in a massless, renormalizable field theory of the fermions
$\psi$, we should make the convention that Eq. (12.101) is exact at some
spacelike normalization point for which $p_1^2 = p_2^2 = p_3^2 = p_4^2 =
-M^2$. Then we should add a counterterm of the form
$\delta_{\mathcal O}\mathcal O(x)$, and adjust this counterterm at each order
of perturbation theory to insure that these relations are preserved. We refer
to the operator satisfying the normalization condition (12.101) at $M^2$ as
$\mathcal O_M$.

The renormalized operator $\mathcal O_M$ is a rescaled version of the operator
$\mathcal O_0$ built of bare fields,

$$ \mathcal O_0(x) = \bar\psi_0\gamma^\mu(1 - \gamma^5)\psi_0\, \bar\psi_0\gamma_\mu(1 - \gamma^5)\psi_0. \tag{12.102} $$

As we did for the elementary fields, we can write this relation as

$$ \mathcal O_0 = Z_{\mathcal O}(M)\, \mathcal O_M. \tag{12.103} $$

This allows us to write the generalization of the relation (12.35) between
Green's functions of bare and renormalized fields. Let us return to the
language of scalar field theories and consider $\mathcal O(x)$ to be a local
operator in a scalar field theory. Define

$$ G^{(n;1)}(p_1, \ldots, p_n; k) = \big\langle \phi(p_1) \cdots \phi(p_n)\, \mathcal O_M(k) \big\rangle. \tag{12.104} $$

Then $G^{(n;1)}$ is related to a Green's function of bare fields by

$$ G^{(n;1)}(p_1, \ldots, p_n; k) = Z(M)^{-n/2} Z_{\mathcal O}(M)^{-1} \big\langle \phi_0(p_1) \cdots \phi_0(p_n)\, \mathcal O_0(k) \big\rangle. \tag{12.105} $$

Repeating the derivation of Eqs. (12.63) and (12.64), we find that the Green's
functions containing a local operator obey the Callan-Symanzik equation

$$ \Big[ M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + n\gamma(\lambda) + \gamma_{\mathcal O}(\lambda) \Big] G^{(n;1)} = 0, \tag{12.106} $$

where

$$ \gamma_{\mathcal O} = M\frac{\partial}{\partial M}\log Z_{\mathcal O}(M). \tag{12.107} $$

It often happens that a quantum field theory contains several operators
with the same quantum numbers. For example, in quantum electrodynamics,
the operators $\bar\psi[\gamma^\mu D^\nu + \gamma^\nu D^\mu]\psi$ and
$F^{\mu\lambda}F^{\nu}{}_{\lambda}$ are both symmetric tensors with zero
electric charge; in addition, both operators have mass dimension 4. Such
operators, with the same quantum numbers and the same mass dimension,
can be mixed by quantum corrections.* For such a set of operators
$\{\mathcal O^i\}$, the relation of renormalized and bare operators must be
generalized to

$$ \mathcal O_0^i = Z_{\mathcal O}^{ij}(M)\, \mathcal O_M^j. \tag{12.108} $$

This relation in turn implies that the anomalous dimension function
$\gamma_{\mathcal O}$ in the Callan-Symanzik equation must be generalized to a
matrix,

$$ \gamma_{\mathcal O}^{ij} = \big[ Z_{\mathcal O}^{-1}(M) \big]^{ik} M\frac{\partial}{\partial M} \big[ Z_{\mathcal O}(M) \big]^{kj}. \tag{12.109} $$

Most of our applications of (12.106) in Chapter 18 will require this
generalization.

On the other hand, there are some operators for which the rescaling and
anomalous dimensions are especially simple. If $\mathcal O$ is the quark
number current $\bar\psi\gamma^\mu\psi$, its normalization is fixed once and
for all because the associated charge

$$ Q = \int d^3x\, \bar\psi\gamma^0\psi $$

is just the conserved integer number of quarks minus antiquarks in a given
state. More generally, for any conserved current $J^\mu$, $Z_J(M) = 1$ and
$\gamma_J = 0$. The same argument applies to the energy-momentum tensor. Thus,
in the QED example above, the specific linear combination

T"" = ^[7"D" + 7^ F' lX F \ (12.110) 

receives no rescaling and no anomalous dimension. This linear combination of
operators must be an eigenvector of the matrix $\gamma_{\mathcal O}^{ij}$ with
eigenvalue zero.


*Our assumption that we are working in a massless field theory constrains the
possibilities for operator mixing. In a massive field theory, operators of a
given dimension can also mix with operators of lower dimension.

So far, our discussion of operator matrix elements has been rather abstract.
To make it more concrete, we will construct a formula for computing
$\gamma_{\mathcal O}$ to leading order from one-loop counterterms, and then
apply this formula to a simple example in $\phi^4$ theory.

To find a simple formula for $\gamma_{\mathcal O}$, we follow the same path
that took us from Eq. (12.52) to the formula (12.53) for the $\beta$ function.
Consider an operator whose normalization condition is based on a Green's
function with $m$ scalar fields:

$$ G^{(m;1)} = \big\langle \phi(p_1) \cdots \phi(p_m)\, \mathcal O_M(k) \big\rangle. \tag{12.111} $$

To compute this Green's function to one-loop order, we find the set of
diagrams:


The last diagram is the counterterm $\delta_{\mathcal O}$ needed to maintain
the renormalization condition. Notice that the counterterm $\delta_Z$ also
appears. If we insist that this sum of diagrams satisfies the Callan-Symanzik
equation (12.106) to leading order in $\lambda$, we find, analogously to
(12.53), the relation

$$ \gamma_{\mathcal O}(\lambda) = M\frac{\partial}{\partial M} \Big( -\delta_{\mathcal O} + \frac{m}{2}\delta_Z \Big). \tag{12.112} $$

As a specific example of the use of this formula, let us compute the
anomalous dimension $\gamma_{\phi^2}$ of the mass operator $\phi^2$ in
$\phi^4$ theory. There is a small subtlety involved in this computation. The
Feynman diagrams of $\phi^4$ theory generate an additive mass renormalization,
which must be removed by the mass counterterm at each order in perturbation
theory. We would like to define the mass operator as a perturbation which we
can add to the massless theory defined in this way. To clarify the distinction
between the underlying mass, which is renormalized to zero, and the explicit
mass perturbation, we will analyze a Green's function of $\phi^2$ in which
this operator carries a specific nonzero momentum. We thus choose to define
the normalization of $\phi^2$ by the convention

$$ \big\langle \phi(p)\phi(q)\phi^2(k) \big\rangle = \frac{i}{p^2}\frac{i}{q^2} \cdot 2 \tag{12.113} $$

at $p^2 = q^2 = k^2 = -M^2$.

The one-particle-irreducible one-loop correction to (12.113) is

$$ \begin{aligned}
&\frac{i}{p^2}\frac{i}{q^2}\,(-i\lambda) \int \frac{d^4 r}{(2\pi)^4}\, \frac{i}{r^2}\, \frac{i}{(k + r)^2} \\
&\qquad = \frac{i}{p^2}\frac{i}{q^2} \Big[ -\frac{\lambda}{(4\pi)^2}\, \frac{\Gamma(2 - \frac{d}{2})}{\Delta^{2 - d/2}} \Big],
\end{aligned} \tag{12.114} $$

where $\Delta$ is a function of the external momenta. At $-M^2$, this
contribution must be canceled by a counterterm diagram, whose value is

$$ \frac{i}{p^2}\frac{i}{q^2} \cdot 2\delta_{\phi^2}. \tag{12.115} $$

Thus, the counterterm must be

$$ \delta_{\phi^2} = \frac{\lambda}{2(4\pi)^2}\, \frac{\Gamma(2 - \frac{d}{2})}{(M^2)^{2 - d/2}}. \tag{12.116} $$

Since $\delta_Z$ is finite to order $\lambda$, this is the only contribution
to (12.112), and we find

$$ \gamma_{\phi^2} = \frac{\lambda}{16\pi^2}. \tag{12.117} $$

This function can be used together with the $\gamma$ and $\beta$ functions of
pure massless $\phi^4$ theory to discuss the scaling of Green's functions that
include the mass operator.
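The passage from the counterterm (12.116) to the anomalous dimension (12.117) rests on the fact that $M\,\partial/\partial M$ acting on $\Gamma(2 - d/2)/(M^2)^{2-d/2}$ stays finite as $d \to 4$. This can be checked numerically. The sketch below is an illustration under stated assumptions: it sets $d = 4 - \epsilon$ with a small but finite $\epsilon$, treats $M$ as a pure number in arbitrary units, drops $\delta_Z$ as the text instructs at this order, and takes the derivative by a central difference in $\log M$. The function names and default parameters are hypothetical.

```python
import math

def delta_phi2(lam, M, eps):
    """Counterterm of Eq. (12.116) in d = 4 - eps dimensions, where 2 - d/2 = eps/2."""
    prefactor = lam / (2.0 * (4.0 * math.pi) ** 2)
    return prefactor * math.gamma(eps / 2.0) / (M * M) ** (eps / 2.0)

def gamma_phi2(lam, M=1.0, eps=1e-4, h=1e-4):
    """M d/dM of (-delta_phi2), via a central difference in log M.

    This is Eq. (12.112) with the delta_Z term dropped.
    """
    up = -delta_phi2(lam, M * math.exp(h), eps)
    down = -delta_phi2(lam, M * math.exp(-h), eps)
    return (up - down) / (2.0 * h)

if __name__ == "__main__":
    lam = 0.3
    print("numerical gamma_phi2 :", gamma_phi2(lam))
    print("lambda / (16 pi^2)   :", lam / (16.0 * math.pi ** 2))
```

The two printed numbers should agree up to corrections of relative order $\epsilon$, because the $1/\epsilon$ pole of the Gamma function is compensated by the factor of $\epsilon$ produced by the derivative.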


12.5 Evolution of Mass Parameters 

Finally, we discuss the renormalization group for theories with masses. We 
note, though, that although we treat these masses as arbitrary parameters, 
we will continue to use renormalization conventions that are independent of 
mass, and we will often treat the masses as small parameters. This approach 
breaks down at momentum scales much less than the scale of masses, but 
it is sufficient, and simpler than alternative approaches, for most practical 
applications of the renormalization group. 

In the previous section, we worked out the scaling of Green's functions
containing one power of the mass operator. It is a small step to generalize
this discussion to include an arbitrary number $\ell$ of mass operators; one simply
finds the equation (12.106) with the coefficient $\ell$ in front of the term $\gamma_{\phi^2}$.
Now consider what would happen if we add the mass operator directly to the
Lagrangian of the massless $\phi^4$ theory, treating this operator as a perturbation.
If $\mathcal{L}_M$ is the massless Lagrangian renormalized at the scale $M$, the new
Lagrangian will be

$$ \mathcal{L}_M - \tfrac{1}{2} m^2 \phi^2_M. \tag{12.118} $$

The Green's function of $n$ scalar fields in the theory (12.118) could be expressed
as a perturbation series in the mass parameter $m^2$. The coefficient
of $(m^2)^\ell$ would be a joint correlation function of the $n$ scalar fields with
$\ell$ powers of $\phi^2_M$, and would therefore satisfy the Callan-Symanzik equation
(12.106) with the extra factor $\ell$ as noted above. In general, we can use the
operator $m^2(\partial/\partial m^2)$ to count the number of insertions $\ell$ of $\phi^2$. Then the
Green's functions of the massive $\phi^4$ theory, renormalized according to the
mass-independent scheme, satisfy the equation

$$ \left[ M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + n\gamma(\lambda) + \gamma_{\phi^2}\, m^2\frac{\partial}{\partial m^2} \right] G^{(n)}(\{p_i\}; M, \lambda, m^2) = 0. \tag{12.119} $$

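This argument extends to any perturbation of massless $\phi^4$ theory. In the
general case,

$$ \mathcal{L}(C_i) = \mathcal{L}_M + C_i\, \mathcal{O}^i_M(x), \tag{12.120} $$

and the Green's functions of this perturbed theory satisfy

$$ \left[ M\frac{\partial}{\partial M} + \beta(\lambda)\frac{\partial}{\partial\lambda} + n\gamma(\lambda) + \sum_i \gamma_i(\lambda)\, C_i\frac{\partial}{\partial C_i} \right] G^{(n)}(\{p_i\}; M, \lambda, \{C_i\}) = 0. \tag{12.121} $$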

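To interpret this equation, it will help to make a slight change to bring
the notation in line with our new viewpoint. Let $d_i$ be the mass dimension of
the operator $\mathcal{O}^i$. Then rewrite (12.120) by representing each coefficient $C_i$ as
a power of $M$ and a dimensionless coefficient $\rho_i$:

$$ \mathcal{L}(\rho_i) = \mathcal{L}_M + \rho_i\, M^{4-d_i}\, \mathcal{O}^i_M(x). \tag{12.122} $$

The size of each $\rho_i$ indicates the importance of the corresponding operator at
the scale $M$. This new convention introduces further explicit $M$ dependence
into the Green's functions, which is compensated by a rescaling of the $\rho_i$.
Thus (12.121) must be modified to

$$ \left[ M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial\lambda} + n\gamma + \sum_i \bigl[\gamma_i(\lambda) + d_i - 4\bigr]\, \rho_i\frac{\partial}{\partial\rho_i} \right] G^{(n)}(\{p_i\}; M, \lambda, \{\rho_i\}) = 0. \tag{12.123} $$

The meaning of this equation becomes clearer if we define

$$ \beta_i = (d_i - 4 + \gamma_i)\,\rho_i. \tag{12.124} $$

Then

$$ \left[ M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial\lambda} + \sum_i \beta_i\frac{\partial}{\partial\rho_i} + n\gamma \right] G^{(n)}(\{p_i\}; M, \lambda, \{\rho_i\}) = 0. \tag{12.125} $$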

Now all of the coupling constants $\rho_i$ appear on the same footing as $\lambda$. We can
solve this generalized Callan-Symanzik equation using the same method as in
Section 12.3, by introducing bacteria, which now live in a multidimensional
velocity field $(\beta, \beta_i)$. The solution will depend on a set of running coupling
constants which obey the equations

$$ \frac{d}{d\log(p/M)}\, \bar\rho_i = \beta_i(\bar\rho_i, \bar\lambda). \tag{12.126} $$


It is interesting to examine this flow of coupling constants for the case
where all the dimensionless parameters $\lambda$, $\rho_i$ are small, so that we are close
to the free scalar field Lagrangian. In this situation, we can ignore the contribution
of $\gamma_i$ to $\beta_i$; then

$$ \frac{d}{d\log(p/M)}\, \bar\rho_i = \bigl[d_i - 4 + \cdots\bigr]\, \bar\rho_i. \tag{12.127} $$

The solution to this equation is

$$ \bar\rho_i = \rho_i \left(\frac{p}{M}\right)^{d_i - 4}. \tag{12.128} $$

Operators with mass dimension greater than 4, corresponding to nonrenormalizable
interactions, become less important as a power of $p$ as $p \to 0$. This
is exactly the behavior that we found in Eq. (12.27) using Wilson's method.
Since we have now generalized the Callan-Symanzik equation to incorporate
the most general perturbation of the free-field Lagrangian, it is pleasing that
we recover the full structure of the Wilson flow of coupling constants. In addition,
this more formal method gives us a way to compute the corrections to
the Wilson flow due to $\lambda\phi^4$ interactions, order by order in $\lambda$, using Feynman
diagrams.
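
The power-law behavior of (12.128) is easy to tabulate. The following is a
minimal sketch, not part of the original derivation; the function name
`running_coefficient` and the sample operator dimensions are illustrative
assumptions, and the anomalous dimensions are set to zero as in (12.127).

```python
def running_coefficient(rho, op_dim, p_over_M):
    """Leading-order running coefficient, Eq. (12.128), in four dimensions.

    rho       -- dimensionless coefficient rho_i defined at the scale M
    op_dim    -- mass dimension d_i of the operator (hypothetical input)
    p_over_M  -- ratio p/M of the probe momentum to the renormalization scale
    """
    return rho * p_over_M ** (op_dim - 4)


if __name__ == "__main__":
    # Compare a relevant (d_i = 2), marginal (d_i = 4), and irrelevant
    # (d_i = 6) operator as the momentum is lowered toward p -> 0.
    for op_dim in (2, 4, 6):
        values = [running_coefficient(1.0, op_dim, x) for x in (1.0, 0.1, 0.01)]
        print(op_dim, values)
```

For `op_dim = 6` the coefficient falls by a factor of 100 for each decade in
momentum, while for the mass operator (`op_dim = 2`) it grows by the same
factor, which is the relevant/irrelevant distinction described above.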

We can move one step closer to the generality of Section 12.1 by moving
from four dimensions to an arbitrary dimension $d$. We require only two small
changes in the formalism. First, the operator $\phi^4$ acquires a dimensionful coefficient
when $d \neq 4$, and we must take account of this. We have seen in the
discussion below Eq. (10.13) that a scalar field has mass dimension $(d - 2)/2$.
Thus, the operator $\phi^4$ has mass dimension $(2d - 4)$, and so its coefficient has
dimension $4 - d$. To implement the renormalization group, we redefine $\lambda$ so
that this coefficient remains dimensionless in $d$ dimensions. We treat the mass
term similarly, replacing $m^2 \to \rho_m M^2$. Thus the expansion of the Lagrangian
about the free scalar field theory $\mathcal{L}_0$ reads:

$$ \mathcal{L} = \mathcal{L}_0 - \tfrac{1}{2}\rho_m M^2 \phi^2_M - \frac{\lambda}{4!} M^{4-d} \phi^4_M + \cdots. \tag{12.129} $$


The second required change in the formalism is that of recomputing the
$\beta$ and $\gamma$ functions in the new dimension. To order $\lambda$, the result is surprisingly
innocuous. Consider, for example, the computation of $\gamma_{\phi^2}$, Eq. (12.114). This
computation, which was performed in dimensional regularization, is essentially
unchanged. For general values of $d$, the derivative of the counterterm $\delta_{\phi^2}$ with
respect to $\log M$ still involves the factor

$$ M\frac{\partial}{\partial M} \left( \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} \right) = -2 + O(4 - d). \tag{12.130} $$

This observation holds for all of the $\gamma_i$, and the $\beta$ function is shifted only by
the contribution of the mass dimension of $\lambda$. Thus, for $d$ near 4,

$$ \beta = (d-4)\lambda + \beta^{(4)}(\lambda) + \cdots, $$
$$ \beta_m = \bigl[-2 + \gamma^{(4)}_{\phi^2}\bigr]\rho_m + \cdots, \tag{12.131} $$
$$ \beta_i = \bigl[d_i - d + \gamma^{(4)}_i\bigr]\rho_i + \cdots, $$

where the functions with a superscript (4) are the four-dimensional results
obtained earlier in this section, and the omitted correction terms are of order
$\lambda\cdot(d - 4)$. The precise form of these corrections depends on the renormalization
scheme.†

Using the explicit four-dimensional result (12.46) for $\beta$, we now find

$$ \beta = -(4 - d)\lambda + \frac{3\lambda^2}{16\pi^2}. \tag{12.132} $$

For $d > 4$, this function is positive and predicts that the coupling constant
flows smoothly to zero at large distances. However, when $d < 4$, this $\beta(\lambda)$ has
the form shown in Fig. 12.4(b). Thus it generates just the coupling constant
flow that we discussed from Wilson's viewpoint below Eq. (12.29). At small
values of $\lambda$, the coupling constant increases in importance with increasing
distance, as dimensional analysis predicts. However, at larger $\lambda$, the coupling
constant decreases as a result of its own nonlinear effects. These two tendencies
come into balance at the zero of the beta function,

$$ \lambda_* = \frac{16\pi^2}{3}(4 - d), \tag{12.133} $$

which gives a nontrivial fixed point of the renormalization group flows in scalar
field theory for $d < 4$. If we formally consider values of $d$ close to 4, this fixed
point occurs in a region where the coupling constant is small and we can use
Feynman diagrams to investigate its properties. This fixed point, which was
discovered by Wilson and Fisher,‡ has important consequences for statistical
mechanics, which we will discuss in Chapter 13.
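
The approach to the fixed point can be checked numerically. The sketch below
integrates $d\bar\lambda/d\log(p/M) = \beta(\bar\lambda)$ toward the infrared using the one-loop
form (12.132). It is an illustration under stated assumptions: the function
names, the fourth-order Runge-Kutta stepper, the step count, and the choice
$d = 3$ (where the one-loop formula is only a formal extrapolation) are not
taken from the text.

```python
import math


def beta(lam, d):
    """One-loop beta function of Eq. (12.132) in d dimensions."""
    return -(4.0 - d) * lam + 3.0 * lam ** 2 / (16.0 * math.pi ** 2)


def fixed_point(d):
    """Nontrivial zero of beta, Eq. (12.133)."""
    return 16.0 * math.pi ** 2 * (4.0 - d) / 3.0


def flow_to_infrared(lam0, d, t_final=-40.0, steps=4000):
    """Integrate d(lambda)/dt = beta(lambda), with t = log(p/M),
    from t = 0 down to t_final < 0 (large distances) using RK4."""
    h = t_final / steps  # negative step: we move toward small momenta
    lam = lam0
    for _ in range(steps):
        k1 = beta(lam, d)
        k2 = beta(lam + 0.5 * h * k1, d)
        k3 = beta(lam + 0.5 * h * k2, d)
        k4 = beta(lam + h * k3, d)
        lam += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam


if __name__ == "__main__":
    d = 3.0
    print("fixed point:", fixed_point(d))
    # One starting value below the fixed point and one above it.
    for lam0 in (1.0, 100.0):
        print(lam0, "->", flow_to_infrared(lam0, d))
```

Starting values on either side of $\lambda_*$ end up at the same coupling, which
is the balance between the two tendencies described above.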

Critical Exponents: A First Look 

As an application of the formalism of this section, let us calculate the
renormalization group flow of the coefficient of the mass operator in $\phi^4$ theory. This
is found by integrating Eq. (12.126), using the value of $\beta_m$ from (12.131):

$$ \frac{d}{d\log p}\, \bar\rho_m = \bigl[-2 + \gamma_{\phi^2}(\bar\lambda)\bigr]\, \bar\rho_m. \tag{12.134} $$

†This expansion is displayed to rather high order in E. Brézin, J. C. Le Guillou,
and J. Zinn-Justin, Phys. Rev. D9, 1121 (1974).

‡K. G. Wilson and M. E. Fisher, Phys. Rev. Lett. 28, 240 (1972).

For $\bar\lambda = 0$, this equation gives the trivial relation

$$ \bar\rho_m = \rho_m \left(\frac{M}{p}\right)^{2}. \tag{12.135} $$


If we recall that we originally defined $\rho_m = m^2/M^2$, this is just a complicated
way of saying that, when $p$ becomes of order $m$, the mass term becomes an
important term in the Lagrangian. At this point, the correlations in the $\phi$
field begin to die away exponentially. The characteristic range of correlations,
which in statistical mechanics would be called the correlation length $\xi$, is given
by

$$ \xi \sim p_0^{-1}, \quad \text{where } \bar\rho_m(p_0) = 1. \tag{12.136} $$

If we evaluate this criterion, we find $\xi \sim (M^2 \rho_m)^{-1/2}$, that is, $\xi \sim m^{-1}$, as
we would have expected.

However, the application of this criterion at the fixed point $\lambda_*$ gives a
much more interesting result. If we set $\bar\lambda = \lambda_*$, then Eq. (12.134) has the
solution

$$ \bar\rho_m = \rho_m \left(\frac{M}{p}\right)^{2 - \gamma_{\phi^2}(\lambda_*)}. \tag{12.137} $$

This gives a nontrivial relation

$$ \xi \sim \rho_m^{-\nu}, \tag{12.138} $$

where the exponent $\nu$ is given formally by the expression

$$ \nu = \frac{1}{2 - \gamma_{\phi^2}(\lambda_*)}. \tag{12.139} $$

Using the results (12.133) and (12.117), we can evaluate this explicitly for $d$
near 4:

$$ \nu^{-1} = 2 - \tfrac{1}{3}(4 - d). \tag{12.140} $$

Wilson and Fisher showed that this expression can be extended to a systematic
expansion of $\nu$ in powers of $\epsilon = (4 - d)$.
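
The chain of steps from (12.134) to (12.140) can be collected in one place.
The block below is a sketch that uses only (12.117), (12.133), (12.134), and
the criterion (12.136), all at leading order in $4 - d$.

```latex
\begin{align*}
\frac{d\bar\rho_m}{d\log p}
  &= -\bigl[2-\gamma_{\phi^2}(\lambda_*)\bigr]\bar\rho_m
  \;\Longrightarrow\;
  \bar\rho_m(p) = \rho_m\left(\frac{M}{p}\right)^{2-\gamma_{\phi^2}(\lambda_*)}, \\
\bar\rho_m(p_0) = 1
  &\;\Longrightarrow\;
  p_0 = M\,\rho_m^{\,1/(2-\gamma_{\phi^2}(\lambda_*))},
  \qquad
  \xi \sim p_0^{-1} \sim \rho_m^{-\nu},
  \quad
  \nu = \frac{1}{2-\gamma_{\phi^2}(\lambda_*)}, \\
\gamma_{\phi^2}(\lambda_*)
  &= \frac{\lambda_*}{16\pi^2}
   = \frac{1}{16\pi^2}\cdot\frac{16\pi^2}{3}\,(4-d)
   = \frac{4-d}{3}
  \;\Longrightarrow\;
  \nu^{-1} = 2-\tfrac{1}{3}(4-d).
\end{align*}
```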

Because the exponent $\nu$ has an interpretation in statistical mechanics, it is
directly measurable in the realistic case of three dimensions. In the statistical
mechanical interpretation of scalar field theory, $\rho_m$ is just the parameter that
one must adjust finely to bring the system to the critical temperature. Thus $\rho_m$
is proportional to the deviation from the critical temperature, $(T - T_C)$. Our
field theoretic analysis thus implies that the correlation length in a magnet
grows as $T \to T_C$ according to the scaling relation

$$ \xi \sim (T - T_C)^{-\nu}. \tag{12.141} $$


It also gives a definite, and somewhat unusual, prediction for the value of $\nu$.
It predicts that $\nu$ is close to the value 1/2 suggested by the Landau approximation
studied in Chapter 8 (Eq. (8.16)), but that $\nu$ differs from this value
by some systematic corrections.

A scaling behavior of the type (12.141) is observed in magnets, and it is
known that several definite scaling laws occur, depending on the symmetry of
the spin ordering. Magnets can be characterized by the number of fluctuating
spin components: $N = 1$ for magnets with a preferred axis, $N = 2$ for magnets
with a preferred plane, and $N = 3$ for magnets that are isotropic in
three-dimensional space. The experimental value of $\nu$ depends on this parameter.
The $\phi^4$ field theory discussed in this chapter contained only one fluctuating
field; this is the analogue of a magnet with one spin component. In Chapter 11,
we considered a generalization of $\phi^4$ theory to a theory of $N$ fields with $O(N)$
symmetry. We might guess that this system models magnets of general $N$.

If this correspondence is correct, Eq. (12.140) gives a prediction for the
value of $\nu$ in magnets with a preferred axis. In Section 13.1, we will repeat
the analysis leading to this equation in the $O(N)$-symmetric $\phi^4$ theory and
derive the formula

$$ \nu^{-1} = 2 - \frac{N+2}{N+8}\,(4 - d), \tag{12.142} $$

valid for general $N$ to first order in $(4 - d)$. For the cases $N = 1, 2, 3$ and
$d = 3$, this formula predicts

$$ \nu = 0.60,\ 0.63,\ 0.65. \tag{12.143} $$

For comparison, the best current experimental determinations of $\nu$ in magnetic
systems give*

$$ \nu = 0.64,\ 0.67,\ 0.71 \tag{12.144} $$

for $N = 1, 2, 3$. The prediction (12.143) gives a reasonable first approximation
to the experimental results.
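
The numbers in (12.143) follow from direct substitution into (12.142). A short
sketch of that arithmetic is given here; the function name `nu_first_order` is
an illustrative choice, and exact rational arithmetic is used only to avoid
rounding ambiguity.

```python
from fractions import Fraction


def nu_first_order(N, d):
    """Exponent nu from Eq. (12.142), to first order in (4 - d)."""
    eps = 4 - Fraction(d)
    inverse_nu = 2 - Fraction(N + 2, N + 8) * eps
    return 1 / inverse_nu


if __name__ == "__main__":
    for N in (1, 2, 3):
        nu = nu_first_order(N, 3)
        # Prints the exact fraction and its decimal value.
        print(N, nu, float(nu))
```

In $d = 3$ the exact values are 3/5, 5/8, and 11/17, which round to the entries
quoted in (12.143).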

The ability of quantum field theory to predict the critical exponents gives
a concrete application both of the formal connection between quantum field
theory and statistical mechanics and of the flows of coupling constants predicted
by the renormalization group. However, there is another experimental
aspect of critical behavior that is even more remarkable, and more persuasive.
Critical behavior can be studied not only in magnets but also in fluids,
binary alloys, superfluid helium, and a host of other systems. It has long been
known that, for systems with this disparity of microscopic dynamics, the scaling
exponents at the critical point depend only on the dimension $N$ of the
fluctuating variables and not on any other detail of the atomic structure.
Fluids, binary alloys, and uniaxial magnets, for example, have the same critical
exponents. To the untutored eye, this seems to be a miracle. But for a
quantum field theorist, this conclusion is the natural outcome of the
renormalization group idea, in which most details of the field theoretic interaction
are described by operators that become irrelevant as the field theory finds its
proper, simple, large-distance behavior.


*For further details, see Table 13.1 and the accompanying discussion.

Problems

12.1 Beta functions in Yukawa theory. In the pseudoscalar Yukawa theory studied
in Problem 10.2, with masses set to zero,

$$ \mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi)^2 - \frac{\lambda}{4!}\phi^4 + \bar\psi\,(i\not\!\partial)\,\psi - ig\,\bar\psi\gamma^5\psi\,\phi, $$

compute the Callan-Symanzik $\beta$ functions for $\lambda$ and $g$:

$$ \beta_\lambda(\lambda, g), \qquad \beta_g(\lambda, g), $$

to leading order in coupling constants, assuming that $\lambda$ and $g^2$ are of the same order.
Sketch the coupling constant flows in the $\lambda$-$g$ plane.

12.2 Beta function of the Gross-Neveu model. Compute $\beta(g)$ in the
two-dimensional Gross-Neveu model studied in Problem 11.3,

$$ \mathcal{L} = \bar\psi_i\, i\not\!\partial\, \psi_i + \tfrac{1}{2} g^2 (\bar\psi_i \psi_i)^2, $$

with $i = 1, \ldots, N$. You should find that this model is asymptotically free. How was
that fact reflected in the solution to Problem 11.3?

12.3 Asymptotic symmetry. Consider the following Lagrangian, with two scalar
fields $\phi_1$ and $\phi_2$:

$$ \mathcal{L} = \tfrac{1}{2}\bigl((\partial_\mu\phi_1)^2 + (\partial_\mu\phi_2)^2\bigr) - \frac{\lambda}{4!}\bigl(\phi_1^4 + \phi_2^4\bigr) - \frac{2\rho}{4!}\bigl(\phi_1^2\phi_2^2\bigr). $$

Notice that, for the special value $\rho = \lambda$, this Lagrangian has an $O(2)$ invariance rotating
the two fields into one another.

(a) Working in four dimensions, find the $\beta$ functions for the two coupling constants
$\lambda$ and $\rho$, to leading order in the coupling constants.

(b) Write the renormalization group equation for the ratio of couplings $\rho/\lambda$. Show
that, if $\rho/\lambda < 3$ at a renormalization point $M$, this ratio flows toward the
condition $\rho = \lambda$ at large distances. Thus the $O(2)$ internal symmetry appears
asymptotically.

(c) Write the $\beta$ functions for $\lambda$ and $\rho$ in $4 - \epsilon$ dimensions. Show that there are
nontrivial fixed points of the renormalization group flow at $\rho/\lambda = 0, 1, 3$. Which
is the most stable? Sketch the pattern of coupling constant flows. This flow
implies that the critical exponents are those of a symmetric two-component
magnet.

Critical Exponents and Scalar Field Theory


The idea of running coupling constants and renormalization-group flows gives
us a new language with which to discuss the qualitative behavior of scalar
field theory. In our first discussion of $\phi^4$ theory, each value of the coupling
constant (and, more generally, each form of the potential and each spacetime
dimension) gave a separate problem to be explored. But in Chapter 12, we
saw that $\phi^4$ theories with different values of the coupling are connected by
renormalization-group flows, and that the pattern of these flows changes
continuously with the spacetime dimension. In this context, it makes sense to ask
the very general question: How does $\phi^4$ theory behave as a function of the
dimension? This chapter will give a detailed answer to this question.

The central ingredient in our analysis will be the Wilson-Fisher fixed point
discussed in Section 12.5. This fixed point exists in spacetime dimensions $d$
with $d < 4$; in those dimensions it controls the renormalization group flows of
massless $\phi^4$ theory. The scalar field theory has manifest or spontaneously broken
symmetry according to the sign of the mass parameter $m^2$. Near $m^2 = 0$,
the theory exhibits scaling behavior with anomalous dimensions whose values
are determined by the renormalization group equations. For $d > 4$, the
Wilson-Fisher fixed point disappears, and only the free-field fixed point remains.
Again, the theory exhibits two distinct phases, but now the behavior
at the transition is determined by the renormalization group flows near the
free-field fixed point, so the scaling laws are those that follow from simple
dimensional analysis.

The continuation of these results to Euclidean space has important implications
for the theory of phase transitions in magnets and fluids. As we
discussed in the previous chapter, the ideas of the renormalization group imply
that the power-law behaviors of thermodynamic quantities near a phase
transition point are determined by the behavior of correlation functions in a
Euclidean $\phi^4$ theory. The results stated in the previous paragraph then imply
the following conclusions for critical scaling laws: For statistical systems
in a space of dimension $d > 4$, the scaling laws are just those following from
simple dimensional analysis. These predictions are precisely those of Landau
theory, which we discussed in Chapter 8. On the other hand, for $d < 4$, the
critical scaling laws are modified, in a way that we can compute using the
renormalization group.


In $d = 4$, we are on the boundary between the two types of scaling behavior.
This corresponds to the situation in which $\phi^4$ theory is precisely
renormalizable. In this case, the dimensional analysis predictions are corrected, but
only by logarithms. We will analyze this case specifically in Section 13.2.

Though it is not obvious, the case $d = 2$ provides another boundary. Here
the transition to spontaneous symmetry breaking is described by a different
quantum field theory, which becomes renormalizable in two dimensions. In
Section 13.3, we will introduce that theory, called the nonlinear sigma model,
and show how its renormalization group behavior merges smoothly with that
of $\phi^4$ theory. By combining all of the results of this chapter, we will obtain
a quantitative understanding of the behavior of $\phi^4$ theory, and of critical
phenomena, over the whole range of spacetime dimensions.

13.1 Theory of Critical Exponents 

At the end of Chapter 12, we used properties of the renormalization group
for scalar field theory to make a prediction about the behavior of correlations
near the critical point of a thermodynamic system. We argued that the
range of correlations, the correlation length $\xi$, should increase to infinity as
one approaches the critical point, according to the scaling law (12.141). The
exponent in this equation, called $\nu$, should depend only on the symmetry of
the order parameter. We argued, further, that this exponent is related to the
anomalous dimension of a local operator in $\phi^4$ theory, and that it can be
computed from Feynman diagrams. In this section, we will show that similar
conclusions apply more generally to a large number of scaling laws associated
with a critical point.

To begin, we will define systematically a set of critical exponents, that is,
exponents of scaling laws that describe the thermodynamic behavior in the vicinity
of the critical point. We will then show, using the Callan-Symanzik equation,
that all these exponents can be reduced to two basic anomalous dimensions.
Finally, we will compare this remarkable prediction of quantum field theory
to experiment.

In suggesting a set of critical scaling laws, we begin with the behavior of
the correlation function of fluctuations of the ordering field. For definiteness,
we will use the language appropriate to a magnet, as in Chapter 8. We will
compute classical thermal expectation values as correlation functions in a
Euclidean quantum field theory, as explained in Section 9.3. The fluctuating
field will be called the spin field $s(x)$, its integral will be the magnetization $M$,
and the external field that couples to $s(x)$ will be called the magnetic field $H$.
(In deference to the magnetization, we will denote the renormalization scale
in the Callan-Symanzik equation by $\mu$ in this section.)

Define the two-point correlation function by 

$$ G(x) = \langle s(x)\, s(0) \rangle, \tag{13.1} $$

or by the connected expectation value, if we are in the magnetized phase where
$\langle s(x) \rangle \neq 0$. Away from the critical point, $G(x)$ should decay exponentially,
according to

$$ G(x) \sim \exp[-|x|/\xi]. \tag{13.2} $$


The approach to the critical point is characterized by the parameter

$$ t = \frac{T - T_C}{T_C}. \tag{13.3} $$

Then we expect that, as $t \to 0$, the correlation length should increase to
infinity. Define the exponent $\nu$, (12.141), by the formula

$$ \xi \sim |t|^{-\nu}. \tag{13.4} $$


Just at $t = 0$, the correlation function should decay only as a power law.
Define the exponent $\eta$ by the formula

$$ G(x) \sim \frac{1}{|x|^{d-2+\eta}}, \tag{13.5} $$

where $d$ is the Euclidean space dimension.

The behaviors of thermodynamic quantities near the critical point define
a number of additional exponents. Typically, the specific heat of the
thermodynamic system diverges as $t \to 0$; define the exponent $\alpha$ by the formula for
the specific heat at fixed external field $H = 0$:

$$ C_H \sim |t|^{-\alpha}. \tag{13.6} $$

Since the ordering sets in at $t = 0$, the magnetization at zero field tends to
zero as $t \to 0$ from below. Define the exponent $\beta$ (not to be confused with the
Callan-Symanzik function) by

$$ M \sim |t|^{\beta}. \tag{13.7} $$

Even at $t = 0$ one has a nonzero magnetization at nonzero magnetic field.
Write the law by which this magnetization tends to zero as $H \to 0$ as the
relation

$$ M \sim H^{1/\delta}. \tag{13.8} $$

Finally, the magnetic susceptibility diverges at the critical point; we write this
divergence as the relation

$$ \chi \sim |t|^{-\gamma}. \tag{13.9} $$

Equations (13.4)-(13.9) define a set of critical exponents $\alpha$, $\beta$, $\gamma$, $\delta$, $\nu$, $\eta$, which
can be measured experimentally for a variety of thermodynamic systems.*

In Chapter 12 we argued, following Wilson, that a thermodynamic system
near its critical point can be described by a Euclidean quantum field theory.
At the level of the atomic scale, the Lagrangian of this quantum field theory
may be complicated; however, when we have integrated out the small-scale
degrees of freedom, this Lagrangian simplifies. If we adjust a parameter of the
theory to insure the presence of long-range correlations, the Lagrangian must
closely approach a fixed point of the renormalization group. Generically, the
Lagrangian will approach the fixed point with a single unstable or relevant
direction, corresponding to the mass parameter of $\phi^4$ theory. In $d < 4$, this
is the Wilson-Fisher fixed point. In $d > 4$, it is the free-field fixed point. For
definiteness, we will assume $d < 4$ in the following discussion.

*A variety of further critical exponents and relations are presented in M. E. Fisher,
Repts. Prog. Phys. 30, 615 (1967).


Exponents of the Spin Correlation Function 

In this setting, we can study the behavior of the spin-spin correlation function
$G(x)$. By the argument just reviewed, $G(x)$ is proportional to the two-point
correlation function of a Euclidean scalar field theory. The technology introduced
in the previous chapter can be applied directly. The correlation function
obeys the Callan-Symanzik equation (12.125),

$$ \left[ \mu\frac{\partial}{\partial\mu} + \sum_i \beta_i\frac{\partial}{\partial\rho_i} + 2\gamma \right] G(x; \mu, \{\rho_i\}) = 0. \tag{13.10} $$

Here we include the $\phi^4$ coupling $\lambda$ among the generalized couplings $\rho_i$.

By dimensional analysis, in $d$ dimensions,

$$ G(x) = \frac{1}{|x|^{d-2}}\, g(\mu|x|, \{\rho_i\}), \tag{13.11} $$

where $g$ is an arbitrary function of the dimensionless parameters. (This is
the Fourier transform of the statement that $G(p) \sim p^{-2}$ times a dimensionless
function.) From this starting point, we can solve the Callan-Symanzik
equation (13.10) by the method of Section 12.3, and find

$$ G(x) = \frac{1}{|x|^{d-2}}\, h(\{\bar\rho_i(x)\}) \cdot \exp\left[ -2 \int_{1/\mu}^{|x|} d\log(|x'|)\; \gamma(\{\bar\rho(x')\}) \right], \tag{13.12} $$

where $h$ is a dimensionless initial condition. The running coupling constants
$\bar\rho_i$ obey the differential equation

$$ \frac{d}{d\log(1/\mu|x|)}\, \bar\rho_i = \beta_i(\{\bar\rho_j\}). \tag{13.13} $$


We studied the solution to this equation in Section 12.5. We saw there
that, for flows that come to the vicinity of the Wilson-Fisher fixed point, the
dimensionless coefficient of the mass operator grows as one moves toward large
distances, while the other dimensionless parameters become small. Let $\lambda_*$ be
the location of the fixed point. Then we can write more explicitly

$$ \bar\rho_m = \rho_m\, (\mu|x|)^{2 - \gamma_{\phi^2}(\lambda_*)}, $$
$$ \bar\rho_i = \rho_i\, (\mu|x|)^{-A_i}, \tag{13.14} $$

where $A_i > 0$ for $i \neq m$. If the deviation of $\lambda$ from the fixed point is treated
as one of the $\rho_i$, by defining

$$ \rho_\lambda = \lambda - \lambda_*, \tag{13.15} $$

this parameter also decreases in importance as a power of $|x|$, as we demonstrated
in Eq. (12.96). In the language of Section 12.1, all of the parameters
$\rho_i$ multiply irrelevant operators, except for $\rho_m$, which multiplies a relevant
operator.

To approach the critical point, we adjust the parameters of the underlying
theory so that, at some scale $(1/\mu)$ near the atomic scale, $\rho_m \ll 1$. If $\rho_m$
is adjusted by tuning the temperature of the thermodynamic system, then
$\rho_m \sim t$. The critical scaling laws will be valid if there is a region of distance
scales where $\bar\rho_m$ remains small while the other $\bar\rho_i$ can be neglected. The scaling
laws can then be computed by evaluating the solution to the Callan-Symanzik
equation with $\bar\rho_m$ given by (13.14) and the other $\bar\rho_i$ set equal to zero. The
corrections to this approximation can be shown to be proportional to positive
powers of $t$.

In this approximation, we should evaluate the function $\gamma(\lambda)$ in (13.12) at
$\rho_\lambda = 0$, that is, at the fixed point. Using this value and the solution for $\bar\rho_m$,
Eq. (13.12) becomes

$$ G(x) = \frac{1}{|x|^{d-2}}\, \frac{1}{(\mu|x|)^{2\gamma(\lambda_*)}}\; h\bigl( t\, (\mu|x|)^{2 - \gamma_{\phi^2}(\lambda_*)} \bigr). \tag{13.16} $$

This equation implies the scaling laws (13.5) and (13.4): For the argument of
$h$ sufficiently small, $G(x)$ obeys Eq. (13.5), with

$$ \eta = 2\gamma(\lambda_*). \tag{13.17} $$

At large distances, $h$ must fall off exponentially, since this function is derived
from a scalar field propagator. From the argument of $h$, we deduce that this
exponential must be of the form

$$ \exp\bigl[ -|x|\,(\mu\, t^{\nu}) \bigr], \tag{13.18} $$

where, as in (12.139),

$$ \nu = \frac{1}{2 - \gamma_{\phi^2}(\lambda_*)}. \tag{13.19} $$

This is precisely the scaling law (13.2), (13.4), with the identification of $\nu$ in
terms of the anomalous dimension of the operator $\phi^2$.
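
The deduction of the exponential (13.18) from the argument of $h$ can be made
explicit. The following short sketch assumes only the solution (13.14) for
$\bar\rho_m$ with $\rho_m \sim t$ and the definition (13.19) of $\nu$.

```latex
\begin{align*}
t\,(\mu|x|)^{2-\gamma_{\phi^2}(\lambda_*)}
  &= t\,(\mu|x|)^{1/\nu}
   = \bigl(\mu|x|\,t^{\nu}\bigr)^{1/\nu}, \\
\text{so } h \text{ depends on } x \text{ only through } \mu|x|\,t^{\nu}
  &\;\Longrightarrow\;
  h \sim \exp\bigl[-|x|\,(\mu\,t^{\nu})\bigr]
  \;\Longrightarrow\;
  \xi \sim \mu^{-1}\,t^{-\nu}.
\end{align*}
```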


Exponents of Thermodynamic Functions 

The thermodynamic critical scaling laws can be derived in a similar way, by
studying the scaling behavior of macroscopic thermodynamic variables. These
are derived from the Gibbs free energy, or, in the language of quantum fields,
from the effective potential of the scalar field theory. Since the effective potential,
and, more generally, the effective action, are constructed from correlation
functions, these quantities should satisfy Callan-Symanzik equations. We will
now construct those equations and then use them to identify the thermodynamic
critical exponents.

In Eq. (11.96), we showed that the effective action $\Gamma$ depends on the
classical field $\phi_{cl}$ in such a way that the $n$th derivative of $\Gamma$ with respect to
$\phi_{cl}$ gives the one-particle-irreducible $n$-point function of the field theory. Thus
we can reconstruct $\Gamma$ from the 1PI functions by writing the Taylor series

$$ \Gamma[\phi_{cl}] = i \sum_{n=2}^{\infty} \frac{1}{n!} \int dx_1 \cdots dx_n\; \phi_{cl}(x_1) \cdots \phi_{cl}(x_n)\; \Gamma^{(n)}(x_1, \ldots, x_n), \tag{13.20} $$

where the $\Gamma^{(n)}$ are the 1PI amplitudes.

To find the Callan-Symanzik equation satisfied by $\Gamma[\phi_{cl}]$, it is easiest to
first work out the equation satisfied by $\Gamma^{(n)}$. We begin by considering the
irreducible three-point function $\Gamma^{(3)}$. This function is defined as

$$ \Gamma^{(3)}(p_1, p_2, p_3) = \frac{1}{G^{(2)}(p_1)\, G^{(2)}(p_2)\, G^{(2)}(p_3)}\; G^{(3)}(p_1, p_2, p_3). \tag{13.21} $$

Rescaling with factors $Z(\mu)$, we see that $\Gamma^{(3)}$ is related to the irreducible
three-point function of bare fields by

$$ \Gamma^{(3)}(p_1, p_2, p_3) = Z(\mu)^{+3/2}\; \Gamma_0^{(3)}(p_1, p_2, p_3). $$

Similarly, the irreducible $n$-point function is related to the corresponding
function of bare fields by

$$ \Gamma^{(n)} = Z(\mu)^{n/2}\; \Gamma_0^{(n)}. \tag{13.22} $$

This relation is identical in form to the corresponding relation for the full
Green's functions, Eq. (12.35), except for the change of sign in the exponent.
From this point, we can follow the logic used to derive the Callan-Symanzik
equation for Green's functions, Eq. (12.41); the only difference is that the $n\gamma$
term enters with the opposite sign. Thus we find

$$ \left[ \mu\frac{\partial}{\partial\mu} + \beta(\lambda)\frac{\partial}{\partial\lambda} - n\gamma(\lambda) \right] \Gamma^{(n)}(\{p_i\}; \mu, \lambda) = 0. \tag{13.23} $$

To convert this to an equation for the effective action, note that, on the
right-hand side of Eq. (13.20), the function $\Gamma^{(n)}$ is accompanied by $n$ powers
of the classical field. Then Eq. (13.23), integrated with $n$ powers of $\phi_{cl}$ and
summed over $n$, is equivalent to the equation

$$ \left[ \mu\frac{\partial}{\partial\mu} + \beta(\lambda)\frac{\partial}{\partial\lambda} - \gamma(\lambda) \int dx\; \phi_{cl}(x) \frac{\delta}{\delta\phi_{cl}(x)} \right] \Gamma([\phi_{cl}]; \mu, \lambda) = 0. \tag{13.24} $$

The operator multiplying $\gamma(\lambda)$ counts the number of powers of $\phi_{cl}$ in each term
of the Taylor expansion. By specializing Eq. (13.24) to the case of constant
$\phi_{cl}$, we find the Callan-Symanzik equation for $V_{\rm eff}$:

$$ \left[ \mu\frac{\partial}{\partial\mu} + \beta\frac{\partial}{\partial\lambda} - \gamma\, \phi_{cl}\frac{\partial}{\partial\phi_{cl}} \right] V_{\rm eff}(\phi_{cl}, \mu, \lambda) = 0. \tag{13.25} $$

To apply Eq. (13.25) to the problem of critical exponents, we first convert
this equation to the notation of statistical mechanics by replacing $\phi_{cl}$ with the
magnetization $M$, the conjugate source $J$ by $H$, and the effective potential
$V_{\rm eff}$ by the Gibbs free energy $\mathbf{G}(M)$. At the same time, we will generalize $\lambda$
to the full set of couplings $\rho_i$. Then (13.25) takes the form

$$ \left[ \mu\frac{\partial}{\partial\mu} + \sum_i \beta_i\frac{\partial}{\partial\rho_i} - \gamma\, M\frac{\partial}{\partial M} \right] \mathbf{G}(M, \mu, \{\rho_i\}) = 0. \tag{13.26} $$


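Now let us find the solution to this equation. As before, we begin from
a statement of dimensional analysis. In $d$ dimensions, the effective potential
has mass dimension $d$, and a scalar field has mass dimension $(d - 2)/2$. Thus

$$ \mathbf{G}(M, \mu, \{\rho_i\}) = M^{2d/(d-2)}\; \hat g\bigl( M\mu^{-(d-2)/2}, \{\rho_i\} \bigr), \tag{13.27} $$

where $\hat g$ is a new dimensionless function. Inserting (13.27) into (13.26), we see
that $\hat g$ satisfies

$$ \left[ \left(1 + \frac{2\gamma}{d-2}\right) \mu\frac{\partial}{\partial\mu} + \sum_i \beta_i\frac{\partial}{\partial\rho_i} - \frac{2d}{d-2}\,\gamma \right] \hat g\bigl( M\mu^{-(d-2)/2}, \{\rho_i\} \bigr) = 0, \tag{13.28} $$

that is,

$$ \left[ M\frac{\partial}{\partial M} - \sum_i \frac{2\beta_i}{d - 2 + 2\gamma}\, \frac{\partial}{\partial\rho_i} + \frac{4d\gamma}{(d-2)(d - 2 + 2\gamma)} \right] \hat g = 0. \tag{13.29} $$

Solving this equation, we find

$$ \mathbf{G}(M) = M^{2d/(d-2)}\; \hat h(\{\bar\rho_i(M)\}) \times \exp\left[ -\int_{\mu^{(d-2)/2}}^{M} d\log(M')\; \frac{4d\gamma}{(d-2)(d - 2 + 2\gamma)} \right], \tag{13.30} $$

where the running coupling constants $\bar\rho_i$ obey

$$ \frac{d}{d\log M}\, \bar\rho_i = \frac{2\beta_i(\{\bar\rho_j\})}{d - 2 + 2\gamma(\{\bar\rho_j\})}. \tag{13.31} $$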


As in our discussion of the spin correlation function, we specialize to the
critical region by assuming that we are on a renormalization group flow that
passes close to the Wilson-Fisher fixed point. We again ignore the effects of
irrelevant operators. Then we should set

$$ \bar\rho_i = 0 \quad \text{for } i \neq m, \tag{13.32} $$

with $\rho_m \sim t$. In this approximation, the Gibbs free energy takes the form

$$ \mathbf{G}(M, t) = M^{2d/(d-2)}\, \bigl( M\mu^{-(d-2)/2} \bigr)^{-4d\gamma(\lambda_*)/[(d-2)(d-2+2\gamma(\lambda_*))]} \cdot \hat h\Bigl( t\, \bigl( M\mu^{-(d-2)/2} \bigr)^{-2(2-\gamma_{\phi^2}(\lambda_*))/(d-2+2\gamma(\lambda_*))} \Bigr), \tag{13.33} $$

where $\hat h$ is a smooth initial condition.

To simplify the form of the exponents in this expression, we anticipate
some of the results below and replace

$$ \beta = \frac{d - 2 + 2\gamma(\lambda_*)}{2\,(2 - \gamma_{\phi^2}(\lambda_*))}, $$
$$ \delta = \frac{2d}{d - 2 + 2\gamma(\lambda_*)} - 1 = \frac{d + 2 - 2\gamma(\lambda_*)}{d - 2 + 2\gamma(\lambda_*)}. \tag{13.34} $$

We must demonstrate that these new exponents indeed correspond to the
ones we have defined in Eqs. (13.7) and (13.8). With these replacements (and
ignoring the dependence on $\mu$ from here on), we find for $\mathbf{G}$ the scaling formula

$$ \mathbf{G}(M, t) = M^{1+\delta}\; \hat h(t M^{-1/\beta}), \tag{13.35} $$

where $\hat h$ has a smooth limit as $t \to 0$. An equivalent way to represent this
formula is

$$ \mathbf{G}(M, t) = t^{\beta(1+\delta)}\; f(M t^{-\beta}). \tag{13.36} $$
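
The equivalence of the two scaling forms (13.35) and (13.36) can be verified by
a change of variable. The sketch below introduces the scaled magnetization
$m = M t^{-\beta}$, a symbol that anticipates the argument of $f$ and is otherwise an
illustrative choice.

```latex
\begin{align*}
m \equiv M\,t^{-\beta}
  &\;\Longrightarrow\;
  M^{1+\delta} = m^{1+\delta}\,t^{\beta(1+\delta)},
  \qquad
  t\,M^{-1/\beta} = t\,\bigl(m\,t^{\beta}\bigr)^{-1/\beta} = m^{-1/\beta}, \\
\mathbf{G}(M,t)
  &= M^{1+\delta}\,\hat h\bigl(t\,M^{-1/\beta}\bigr)
   = t^{\beta(1+\delta)}\,\underbrace{m^{1+\delta}\,\hat h\bigl(m^{-1/\beta}\bigr)}_{f(m)} .
\end{align*}
```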


The scaling laws for thermodynamic quantities follow immediately from
these relations. Along the line $t = 0$, we find from (13.35) that

$$ H = \frac{\partial \mathbf{G}}{\partial M} \sim \hat h(0)\, M^{\delta}, \tag{13.37} $$

which is precisely (13.8). Below the critical temperature, we find the nonzero
value of the magnetization by minimizing $\mathbf{G}$ with respect to $M$. In the scaling
region, this minimum occurs at the minimum $m_0$ of the function $f(m)$ in
(13.36). This leads to relation (13.7), in the form

$$ M t^{-\beta} = m_0. \tag{13.38} $$

If we work above $T_C$ and in zero field, the minimum of $f$ must occur at $M = 0$.
Then

$$ \mathbf{G}(t) = f(0) \cdot t^{\beta(1+\delta)}. \tag{13.39} $$

To compute the specific heat, we differentiate twice with respect to temperature;
this gives the scaling law (13.6), with

$$ 2 - \alpha = \beta(1 + \delta) = \frac{d}{2 - \gamma_{\phi^2}(\lambda_*)}. \tag{13.40} $$

Finally, we must construct the scaling law for the magnetic susceptibility.
From (13.36), the scaling law for $H$ at nonzero $t$ is

$$ H = \frac{\partial G}{\partial M} = t^{\beta\delta} f'(M t^{-\beta}). \tag{13.41} $$

The inverse of this relation is the scaling law

$$ M = t^{\beta}\, c(H t^{-\beta\delta}). \tag{13.42} $$

The magnetic susceptibility at zero field is then

$$ \chi = \left. \frac{\partial M}{\partial H} \right|_{H=0} = c'(0)\, t^{-(\delta-1)\beta}. \tag{13.43} $$

Thus, we confirm Eq. (13.9), with the identification

$$ \gamma = (\delta - 1)\beta = \frac{2(1-\gamma(\lambda_*))}{2 - \gamma_{\phi^2}(\lambda_*)}. \tag{13.44} $$


We have now found explicit expressions for all of the various critical exponents
in terms of the Callan-Symanzik functions. As the dimensionality $d$
approaches 4 from below, the fixed point $\lambda_*$ tends to zero. Then the six critical
exponents approach the values that they would attain in simple dimensional
analysis:

$$ \eta = 0; \quad \nu = \tfrac{1}{2}; \quad \alpha = 0; \quad \beta = \tfrac{1}{2}; \quad \gamma = 1; \quad \delta = 3. \tag{13.45} $$


It is no surprise that the values of $\eta$, $\nu$, and $\beta$ given in (13.45) are those that we
derived in Chapter 8 from the Landau theory of critical phenomena. The
other values can similarly be shown to follow from Landau theory. The renormalization
group analysis tells us how to systematically correct the predictions
of Landau theory to take proper account of the large-scale fluctuations of the
spin field.

Notice that all of the exponents associated with thermodynamic quantities
are constructed from the same ingredients as the exponents associated
with the correlation function. From the field theory viewpoint, this is obvious,
since all of the scaling laws in the field theory must ultimately follow from the
anomalous dimensions of the operators $\phi(x)$ and $\phi^2(x)$, which are precisely
$\gamma(\lambda_*)$ and $\gamma_{\phi^2}(\lambda_*)$. This result, however, has an interesting experimental consequence:
It implies model-independent relations among critical exponents.
For example, in any system with a critical point, this theory predicts

$$ \alpha = 2 - d\nu, \qquad \beta = \tfrac{1}{2}(d - 2 + \eta)\nu. \tag{13.46} $$

These relations test the general framework of identifying a critical point with 
the fixed point of a renormalization group flow. 
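
As a minimal numerical sketch of how Eqs. (13.34), (13.40), (13.44), and (13.46) fit together, the following Python functions map the two anomalous dimensions at the fixed point onto the six exponents and test the scaling relations. The function and argument names are our own. The forms $\nu = 1/(2-\gamma_{\phi^2}(\lambda_*))$ and $\eta = 2\gamma(\lambda_*)$ are taken over from the correlation-function analysis earlier in the chapter and are assumptions of this sketch.

```python
def critical_exponents(d, gamma_star, gamma_phi2_star):
    """Exponents from gamma(lambda_*) and gamma_{phi^2}(lambda_*) in d dimensions."""
    eta = 2.0 * gamma_star                      # assumed: eta = 2*gamma(lambda_*)
    nu = 1.0 / (2.0 - gamma_phi2_star)          # assumed form of Eq. (13.19)
    beta = (d - 2.0 + 2.0 * gamma_star) / (2.0 * (2.0 - gamma_phi2_star))  # (13.34)
    delta = (d + 2.0 - 2.0 * gamma_star) / (d - 2.0 + 2.0 * gamma_star)    # (13.34)
    alpha = 2.0 - d / (2.0 - gamma_phi2_star)   # (13.40)
    gamma = 2.0 * (1.0 - gamma_star) / (2.0 - gamma_phi2_star)             # (13.44)
    return {"eta": eta, "nu": nu, "alpha": alpha,
            "beta": beta, "gamma": gamma, "delta": delta}


def check_scaling_relations(d, exps, tol=1e-12):
    """True if the relations (13.46) and gamma = (delta - 1)*beta hold."""
    ok_alpha = abs(exps["alpha"] - (2.0 - d * exps["nu"])) < tol
    ok_beta = abs(exps["beta"] - 0.5 * (d - 2.0 + exps["eta"]) * exps["nu"]) < tol
    ok_gamma = abs(exps["gamma"] - (exps["delta"] - 1.0) * exps["beta"]) < tol
    return ok_alpha and ok_beta and ok_gamma


# Free-field limit in d = 4: should reproduce the values listed in Eq. (13.45).
landau = critical_exponents(4.0, 0.0, 0.0)
```

Because the relations are algebraic identities in the two anomalous dimensions, `check_scaling_relations` should return `True` for any inputs, not only at the physical fixed point.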

In addition, the field theoretic approach to critical phenomena predicts 
that critical exponents are universal in the sense that they take the same 
values in condensed matter systems that approach the same scalar field fixed 
point in the limit $T \to T_C$.


Values of the Critical Exponents 

Finally, scalar field theory actually predicts the values of $\gamma(\lambda_*)$ and $\gamma_{\phi^2}(\lambda_*)$,
either from the expansion in powers of $\epsilon = 4 - d$ described in Section 12.5 or
by direct expansion of the $\beta$ and $\gamma$ functions in powers of $\lambda$. We can use these
expressions to generate quantitative predictions for the critical exponents.
We gave an example of such a prediction at the end of Section 12.5, when we
presented in Eq. (12.143) the first two terms of an expansion for $\nu$. We now
return to this question to give field-theoretic predictions for all of the critical
exponents.

In our discussion at the end of Section 12.5, we remarked that magnets 
with different numbers of fluctuating spin components are observed to have 
different values for the critical exponents. An optimistic hypothesis would 
be that any thermodynamic system with N fluctuating spin components, or, 
more generally, N fluctuating thermodynamic variables at the critical point, 
would be described by the same fixed point field theory with N scalar fields. 
A natural candidate for this fixed point would be the Wilson-Fisher fixed 
point of the $O(N)$-symmetric $\phi^4$ theory discussed in Chapter 11. We will now
describe the computation of critical exponents for general values of N in this 
theory. 

As a first step, we should compute the values of the functions $\beta(\lambda)$, $\gamma(\lambda)$,
and $\gamma_{\phi^2}(\lambda)$ in four dimensions. This computation parallels the analysis done
in Chapter 12 for ordinary $\phi^4$ theory, so we will only indicate the changes
that need to be made for this case. Just as in ordinary $\phi^4$ theory, the propagator
of the massless $O(N)$-symmetric theory receives no field strength corrections
in one-loop order, and so the one-loop term in $\gamma(\lambda)$ again vanishes.
In Problem 13.2, we compute the leading, two-loop, contribution to $\gamma(\lambda)$ in
$O(N)$-symmetric $\phi^4$ theory:

$$ \gamma = (N+2)\frac{\lambda^2}{4(8\pi^2)^2} + O(\lambda^3). \tag{13.47} $$

The one-loop contribution to the $\beta$ function in $\phi^4$ theory is derived from the
one-loop vertex counterterm $\delta_\lambda$, given in Eq. (12.44). For the $O(N)$-symmetric
case, we computed the divergent part of the corresponding vertex counterterm
in Section 11.2; from Eq. (11.22),

$$ \delta_\lambda = \frac{\lambda^2}{(4\pi)^{d/2}}\, (N+8)\, \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} + \text{finite}. \tag{13.48} $$

Following the logic to Eq. (12.46), or using Eq. (12.54), we find

$$ \beta = (N+8)\frac{\lambda^2}{8\pi^2} + O(\lambda^3). \tag{13.49} $$

This reduces to the $\beta$ function of $\phi^4$ theory if we set $N = 1$ and replace
$\lambda \to \lambda/6$, as indicated below Eq. (11.5). Finally, to compute $\gamma_{\phi^2}$, we must
repeat the computation done at the end of Section 12.4. If we consider, instead
of (12.113), the Green’s function $\langle \phi^i(p)\, \phi^j(q)\, \phi^2(k) \rangle$, and replace the vertex of
$\phi^4$ theory by the four-point vertex following from the Lagrangian (11.5), the
factor $(-i\lambda)$ in the first line of (12.114) is replaced by

$$ (-2i\lambda)\left[ \delta^{ij}\delta^{k\ell} + \delta^{ik}\delta^{j\ell} + \delta^{i\ell}\delta^{jk} \right] \cdot \delta^{k\ell} = -2i\lambda\,(N+2)\,\delta^{ij}. $$

Then

$$ \gamma_{\phi^2} = (N+2)\frac{\lambda}{8\pi^2} + O(\lambda^2). \tag{13.50} $$

Next, we consider the same theory in $(4 - \epsilon)$ dimensions. The $\beta$ function
now becomes

$$ \beta = -\epsilon\lambda + (N+8)\frac{\lambda^2}{8\pi^2}, \tag{13.51} $$

so there is a Wilson-Fisher fixed point at

$$ \lambda_* = \frac{8\pi^2\epsilon}{N+8}. \tag{13.52} $$


At this fixed point, we find

$$ \gamma(\lambda_*) = \frac{N+2}{4(N+8)^2}\,\epsilon^2 + \cdots, \qquad \gamma_{\phi^2}(\lambda_*) = \frac{N+2}{N+8}\,\epsilon + \cdots. \tag{13.53} $$


From these two results, we can work out predictions for the whole set of
critical exponents to order $\epsilon$. As an example, inserting (13.53) into (13.19),
we find

$$ \nu^{-1} = 2 - \frac{N+2}{N+8}\,\epsilon + O(\epsilon^2), \tag{13.54} $$

as claimed at the end of Section 12.5.
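
To see what the leading-order formulas (13.51), (13.52), and (13.54) give in practice, here is a small Python sketch. The function names are ours, and setting $\epsilon = 1$ to reach $d = 3$ extrapolates a first-order result, so the numbers are only rough estimates and not the resummed values discussed below.

```python
import math


def beta_wf(lam, n, eps):
    """One-loop beta function in d = 4 - eps, Eq. (13.51)."""
    return -eps * lam + (n + 8) * lam**2 / (8.0 * math.pi**2)


def lambda_star(n, eps):
    """Wilson-Fisher fixed point, Eq. (13.52)."""
    return 8.0 * math.pi**2 * eps / (n + 8)


def nu_first_order(n, eps):
    """nu from Eq. (13.54), truncated at order eps."""
    return 1.0 / (2.0 - (n + 2) / (n + 8) * eps)


# Rough d = 3 estimates (eps = 1) for N = 1, 2, 3.
for n in (1, 2, 3):
    print(n, round(nu_first_order(n, 1.0), 3))
```

At this order the estimates are roughly 0.60, 0.625, and 0.647 for $N = 1, 2, 3$, already well away from the Landau value of 0.5.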

In our discussion in Chapter 12, we claimed that the predictions of critical
exponents are in rough agreement with experimental data. However, by
computing to higher order, one can obtain a much more precise comparison
of theory and experiment. The $\epsilon$ expansion of critical exponents has now been
worked out through order $\epsilon^5$. More impressively, the $\lambda$ expansion for critical
exponents in $d = 3$ has been worked out through order $\lambda^9$. By summing
this perturbation series, it is possible to obtain very precise estimates of the
anomalous dimensions $\gamma(\lambda_*)$ and $\gamma_{\phi^2}(\lambda_*)$ and, through them, precise predictions
for the critical exponents.

A comparison of these values to direct determinations of the critical exponents
is given in Table 13.1. The column labeled ‘QFT’ gives values of critical
exponents obtained by anomalous dimension calculations using $\phi^4$ perturbation
theory in three dimensions. The column labeled ‘Experiment’ lists a
selection of experimental determinations of the critical exponents in a variety
of systems. These include the liquid-gas critical point in Xe, CO$_2$, and other
fluids, the critical point in binary fluid mixtures with liquid-liquid phase separation,
the order-disorder transition in the atomic arrangement of the Cu-Zn
alloy $\beta$-brass, the superfluid transition in $^4$He, and the order-disorder transitions
in ferromagnets (EuO, EuS, Ni) and antiferromagnets (RbMnF$_3$). The
agreement between experimental determinations of the exponents in different
systems is a direct test of universality. For the case of systems with a single
order parameter ($N = 1$), there is a remarkable diversity of physical systems
that are characterized by the same critical exponents.

The column labeled ‘Lattice’ contains estimates of critical exponents in
abstract lattice statistical mechanical models. For these simplified models, the
statistical mechanical partition function can be calculated in an expansion for
large temperature. With some effort, these expansions can be carried out to
15 terms or more. By resumming these series, one can obtain direct theoretical
estimates of the critical exponents, with an accuracy comparable to that of
the best experiments. The comparison between these values and experiment
tests the identification of experimental systems with the simple Hamiltonians
that were the starting point for our renormalization group analysis.

Table 13.1. Values of Critical Exponents
for Three-Dimensional Statistical Systems

Exponent   Landau   QFT           Lattice       Experiment

N = 1 Systems:
γ          1.0      1.241 (2)     1.239 (3)     1.240 (7)    binary liquid
                                                1.22 (3)     liquid-gas
                                                1.24 (2)     β-brass
ν          0.5      0.630 (2)     0.631 (3)     0.625 (5)    binary liquid
                                                0.65 (2)     β-brass
α          0.0      0.110 (5)     0.103 (6)     0.113 (5)    binary liquid
                                                0.12 (2)     liquid-gas
β          0.5      0.325 (2)     0.329 (9)     0.325 (5)    binary liquid
                                                0.34 (1)     liquid-gas
η          0.0      0.032 (3)     0.027 (5)     0.016 (7)    binary liquid
                                                0.04 (2)     β-brass

N = 2 Systems:
γ          1.0      1.316 (3)     1.32 (1)
ν          0.5      0.670 (3)     0.674 (6)     0.672 (1)    superfluid ⁴He
α          0.0      -0.007 (6)    0.01 (3)      -0.013 (3)   superfluid ⁴He

N = 3 Systems:
γ          1.0      1.386 (4)     1.40 (3)      1.40 (3)     EuO, EuS
                                                1.33 (3)     Ni
                                                1.40 (3)     RbMnF₃
ν          0.5      0.705 (3)     0.711 (8)     0.70 (2)     EuO, EuS
                                                0.724 (8)    RbMnF₃
α          0.0      -0.115 (9)    -0.09 (6)     -0.011 (2)   Ni
β          0.5      0.365 (3)     0.37 (5)      0.37 (2)     EuO, EuS
                                                0.348 (5)    Ni
                                                0.316 (8)    RbMnF₃
η          0.0      0.033 (4)     0.041 (14)

The values of critical exponents in the column ‘QFT’ are obtained by resumming
the perturbation series for anomalous dimensions at the Wilson-Fisher fixed point in
$O(N)$-symmetric $\phi^4$ theory in three dimensions. The values in the column ‘Lattice’
are based on analysis of high-temperature series expansions for lattice statistical mechanical
models. The values in the column ‘Experiment’ are taken from experiments
on critical points in the systems described. In all cases, the numbers in parentheses are
the standard errors in the last displayed digits. This table is based on J. C. Le Guillou
and J. Zinn-Justin, Phys. Rev. B21, 3976 (1980), with some values updated from
J. Zinn-Justin (1993), Chapter 27. A full set of references for the last two columns can
be found in these sources.

The agreement of all three types of determinations of the critical exponents
presents an impressive picture. The picture is certainly not perfect, and
a careful inspection of Table 13.1 reveals some significant discrepancies. But, 
in general, the evidence is compelling that quantum field theory provides the 
basic explanation for the thermodynamic critical behavior of a broad range 
of physical systems. 

13.2 Critical Behavior in Four Dimensions 

Now that we have discussed the general theory of critical exponents for d < 4, 
let us concentrate some attention on the case d = 4. This case obviously has 
special interest for the applications of quantum field theory to elementary 
particle physics. In addition, we now know that d = 4 lies on a boundary at 
which the Wilson-Fisher fixed point disappears. We would like to understand 
the special behavior of quantum field theory predictions at this boundary. 

The most obvious difference between $d < 4$ and $d = 4$ is that, while in the
former case the deviation of $\lambda$ from the fixed point multiplies an irrelevant
operator, in the case $d = 4$, $\lambda$ multiplies a marginal operator. We have seen in
Eq. (12.82) that, at small momenta or large distances, the running value of $\lambda$
still approaches its fixed point, now located at $\lambda = 0$. However, this approach
is described by a much slower function, not a power but only a logarithm of
the distance scale. Thus it is normally not correct to ignore the deviation of
$\lambda$ from the fixed point. Including this effect, we find additional logarithmic
terms, analogous to the dependence of correlation functions on $\log p$ that we
already know characterizes a renormalizable field theory.

To give a nontrivial illustration of this logarithmic dependence, we return
to a problem that we postponed at the end of Chapter 11. In Eq. (11.81),
we obtained the expression for the effective potential of $\phi^4$ theory to second
order in $\lambda$, in the limit of vanishing mass parameter:

$$ V_{\text{eff}} = \frac{1}{4}\phi_{cl}^4 \left[ \lambda + \frac{\lambda^2}{(4\pi)^2} \left( (N+8)\left( \log(\lambda\phi_{cl}^2/M^2) - \tfrac{3}{2} \right) + 9\log 3 \right) \right]. \tag{13.55} $$

(Note that we now return to our standard notation, in which $M$ is the renormalization
scale and $\mu$ is a mass parameter.) This expression seemed to have
a minimum for very small values of $\phi_{cl}$, but only at values so small that

$$ \left| \lambda \log(\lambda\phi_{cl}^2/M^2) \right| \sim 1. \tag{13.56} $$

Since, at the $n$th order of perturbation theory, one finds $n$ powers of this logarithm,
Eq. (13.56) implies that the higher-order terms in $\lambda$ are not necessarily
negligible. What we need is a technique that sums these terms.
This summation is provided by the Callan-Symanzik equation. From
(13.24) or (13.25), the Callan-Symanzik equation for the effective potential
in the massless limit of four-dimensional $\phi^4$ theory is

$$ \left[ M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial\lambda} - \gamma\,\phi_{cl}\frac{\partial}{\partial\phi_{cl}} \right] V_{\text{eff}}(\phi_{cl}, M, \lambda) = 0. \tag{13.57} $$

As before, we can solve for $V_{\text{eff}}$ by combining this equation with the predictions
of dimensional analysis. In $d = 4$,

$$ V_{\text{eff}}(\phi_{cl}, M, \lambda) = \phi_{cl}^4\, v(\phi_{cl}/M, \lambda). \tag{13.58} $$

Then $v$ satisfies

$$ \left[ \phi_{cl}\frac{\partial}{\partial\phi_{cl}} - \frac{\beta}{1+\gamma}\frac{\partial}{\partial\lambda} + \frac{4\gamma}{1+\gamma} \right] v(\phi_{cl}/M, \lambda) = 0. \tag{13.59} $$

This equation for $v$ can be solved by our standard methods, to give

$$ v(\phi_{cl}/M, \lambda) = v_0(\bar\lambda(\phi_{cl})) \exp\left[ -\int_M^{\phi_{cl}} d\log\phi'_{cl}\; \frac{4\gamma(\bar\lambda(\phi'_{cl}))}{1+\gamma(\bar\lambda(\phi'_{cl}))} \right], \tag{13.60} $$

where $\bar\lambda$ satisfies

$$ \frac{d\bar\lambda}{d\log(\phi_{cl}/M)} = \frac{\beta(\bar\lambda)}{1+\gamma(\bar\lambda)}. \tag{13.61} $$

However, since we are working only to the order of the leading loop corrections,
and since $\gamma(\lambda)$ is zero to this order, we can ignore the exponential in
(13.60). In addition, we can ignore the denominator on the right-hand side of
(13.61), so that this equation reduces to the more standard form of the equation
(12.73) for the running coupling constant. Thus, using the leading-order
Callan-Symanzik function, we find

$$ V_{\text{eff}}(\phi_{cl}) = v_0(\bar\lambda(\phi_{cl}))\, \phi_{cl}^4. \tag{13.62} $$

The function $v_0$ in (13.62) is not determined by the Callan-Symanzik
equation. To find this function, we compare (13.62) to the result (13.55) that
we obtained from our explicit one-loop evaluation of the effective potential.
The precise constraint is the following: After choosing the function $v_0(\bar\lambda)$,
substitute for $\bar\lambda$ the solution (12.82) to the renormalization group equation,

$$ \bar\lambda(\phi_{cl}) = \frac{\lambda}{1 - (\lambda/8\pi^2)(N+8)\log(\phi_{cl}/M)}. \tag{13.63} $$

Then expand the result in powers of $\lambda$ and drop terms of order $\lambda^3$ and higher.
If $v_0$ is chosen correctly, the result should agree with (13.55). Applying this
criterion, we find the following result for the effective potential:

$$ V_{\text{eff}}(\phi_{cl}) = \frac{1}{4}\phi_{cl}^4 \left[ \bar\lambda + \frac{\bar\lambda^2}{(4\pi)^2} \left( (N+8)\left( \log\bar\lambda - \tfrac{3}{2} \right) + 9\log 3 \right) \right], \tag{13.64} $$

where $\bar\lambda$ is given by (13.63).

The error in Eq. (13.64) comes in the determination of $v_0$ as a power
series in $\bar\lambda$. Thus this error is of order $\bar\lambda^3$. As $\phi_{cl} \to 0$, $\bar\lambda \to 0$, and so the
representation (13.64) becomes more and more accurate. Thus this formula
successfully sums the powers of the dangerous logarithm (13.56). Viewed as
a function of $\phi_{cl}$, (13.64) has its minimum at $\phi_{cl} = 0$. Thus the apparent
symmetry-breaking minimum of (13.55) is indeed an artifact of the incomplete
perturbation expansion and disappears in a more complete treatment. This
resolves the question that we raised in Section 11.4. We should note that,
in more complicated examples, an apparent symmetry-breaking minimum of
the effective potential found in the one-loop order of perturbation theory can
survive a renormalization-group analysis. An example is given in the Final
Project for Part II.
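
The matching condition between the fixed-order result (13.55) and the improved form (13.64) can be checked numerically. The sketch below is only an illustration under our own assumptions: the function names, the sample point $\phi_{cl}/M = 0.5$, and the choice $N = 1$ are hypothetical, and the prefactor of the one-loop term is taken to be $\lambda^2/(4\pi)^2$. It evaluates the square brackets of both formulas, with $\bar\lambda$ taken from (13.63), and confirms that they differ only at order $\lambda^3$.

```python
import math


def lam_bar(lam, phi_over_m, n):
    """Running coupling of Eq. (13.63)."""
    return lam / (1.0 - lam / (8.0 * math.pi**2) * (n + 8) * math.log(phi_over_m))


def bracket_one_loop(lam, phi_over_m, n):
    """Square bracket of Eq. (13.55), i.e. 4 V_eff / phi_cl^4 at fixed order."""
    log_term = math.log(lam * phi_over_m**2)
    return lam + lam**2 / (4.0 * math.pi)**2 * (
        (n + 8) * (log_term - 1.5) + 9.0 * math.log(3.0))


def bracket_improved(lam, phi_over_m, n):
    """Square bracket of Eq. (13.64): the same form with lam replaced by lam_bar."""
    lb = lam_bar(lam, phi_over_m, n)
    return lb + lb**2 / (4.0 * math.pi)**2 * (
        (n + 8) * (math.log(lb) - 1.5) + 9.0 * math.log(3.0))


def mismatch(lam, phi_over_m=0.5, n=1):
    """Difference between the improved and fixed-order brackets."""
    return bracket_improved(lam, phi_over_m, n) - bracket_one_loop(lam, phi_over_m, n)
```

Reducing `lam` by a factor of ten should shrink `mismatch(lam)` by roughly a factor of a thousand, up to slowly varying logarithms.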

The procedure we have followed in this argument is called the renormalization
group improvement of perturbation theory. The technique can be
applied equally well to the computation of correlation functions and other predictions
of Feynman diagram perturbation theory: One compares the solution
of the Callan-Symanzik equation to the result of a straightforward perturbation
theory computation to the same order in the coupling constant, choosing
the undetermined function in the renormalization group solution in such a 
way as to reproduce the perturbation theory result. In this way, one finds a 
more compact formula in which large logarithms such as those in (13.56) are 
resummed into running coupling constants. This resummation produces the 
dependence of correlation functions on the logarithm of the mass scale that 
characterizes a field theory with a marginal or renormalizable perturbation. 

In the case of $\phi^4$ theory, the running coupling constant goes to zero at
small momenta and becomes large at large momenta. Since the error term
in improved perturbation theory is a power of $\bar\lambda$, the improved perturbation
theory becomes accurate at small momenta but goes out of control at large
momenta. This accords with our physical intuition: We would expect perturbation
theory to be accurate only when the running coupling constant stays
small.

In an asymptotically free theory, where the running coupling constant becomes
small at large momenta, we can find accurate expressions for correlation
functions at large momenta using renormalization-group-improved perturbation
theory. In Chapters 17 and 18 we will use this idea as our major tool in
analyzing the short-distance behavior of the strong interactions.
13.3 The Nonlinear Sigma Model

To complete our study of scalar field theory, we will discuss a nonlinear theory
of scalar fields, whose structure is very different from that of $\phi^4$ theory. This
theory, called the nonlinear sigma model, was first proposed as an alternative
description of spontaneous symmetry breaking. It will be interesting to us for
three reasons. First, it provides a simple explicit example of an asymptotically
free theory. Second, it will give us a second dimensional expansion with which
we can study the Wilson-Fisher fixed point. Then we can see where the Wilson-Fisher
fixed point goes in the space of Lagrangians for dimensions $d$ well below
4. Finally, we will show that the nonlinear sigma model is exactly solvable in
a limit that is different from the standard weak-coupling limit. This solution
will give us further insight into the dependence of symmetry breaking on
spacetime dimensionality.

The d = 2 Nonlinear Sigma Model

We begin our study in two dimensions. In $d = 2$, a scalar field is dimensionless;
thus, any theory of scalar fields $\phi^i$ with a Lagrangian of the form

$$ \mathcal{L} = f_{ij}(\{\phi^k\})\, \partial_\mu\phi^i\, \partial^\mu\phi^j \tag{13.65} $$

has dimensionless couplings and so is renormalizable. Since any function
$f(\{\phi^i\})$ leads to a renormalizable theory, this class of scalar field theories
contains an infinite number of marginal parameters. To restrict these possible
parameters, we must impose some symmetries on the theory.

A simple choice is to take the scalar fields $\phi^i$ to form an $N$-component
unit vector field $n^i(x)$, constrained to satisfy

$$ \sum_{i=1}^{N} |n^i(x)|^2 = 1. \tag{13.66} $$

If we insist that the field theory has $O(N)$ symmetry, the function $f$ in (13.65)
can depend only on the invariant length of $\vec n(x)$, which is constrained by
(13.66). Thus, the most general possible choice for $f$ is a constant. Similarly,
the only possible nonderivative interaction $g(\{n^i\})$ that one might add to
(13.65) is a constant, and this would have no effect on the Green’s functions
of $\vec n$. With these restrictions, the most general Lagrangian one can build from
$\vec n(x)$ with two derivatives and $O(N)$ symmetry is

$$ \mathcal{L} = \frac{1}{2g^2}\, |\partial_\mu \vec n|^2. \tag{13.67} $$

This theory has a straightforward physical interpretation. It is a phenomenological
description of a system with $O(N)$ symmetry spontaneously
broken by the vacuum expectation value of a field that transforms as a vector
of $O(N)$. Consider, for example, the situation in $N$-component $\phi^4$ theory in
its spontaneously broken phase. The field $\phi^i$ acquires a vacuum expectation
value, which we can write in terms of a magnitude and a direction parameterized
by a unit vector

$$ \langle \phi^i \rangle = \phi_0\, n^i(x). \tag{13.68} $$

The fluctuations of $\phi_0$ correspond to a massive field, the field called $\sigma$ in Chapter 11.
The fluctuations of the direction of the unit vector $\vec n(x)$ correspond to
the $N - 1$ Goldstone bosons. Notice that $\vec n$ has $N$ components subject to the
one constraint (13.66), and so contains $N - 1$ degrees of freedom. Formally,
the nonlinear sigma model is the limit of $\phi^4$ theory as the mass of the $\sigma$ field
is taken to infinity while $\phi_0$ is held constant.

Despite this suggestive connection, we will first analyze the nonlinear
sigma model on its own footing as an independent quantum field theory. It
is convenient to solve the constraint and parametrize $\vec n$ by $N - 1$ Goldstone
boson fields $\pi^k$:

$$ n^i = (\pi^1, \ldots, \pi^{N-1}, \sigma), \tag{13.69} $$

where, by definition,

$$ \sigma = (1 - \vec\pi^2)^{1/2}. \tag{13.70} $$

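The configuration $\pi^k = 0$ corresponds to a uniform state of spontaneous
symmetry breaking, oriented in the $N$ direction. The representation (13.69)
implies that

$$ |\partial_\mu \vec n|^2 = |\partial_\mu \vec\pi|^2 + \frac{(\vec\pi\cdot\partial_\mu\vec\pi)^2}{1 - \vec\pi^2}. \tag{13.71} $$

Then the Lagrangian (13.67) takes the form

$$ \mathcal{L} = \frac{1}{2g^2} \left[ |\partial_\mu\vec\pi|^2 + \frac{(\vec\pi\cdot\partial_\mu\vec\pi)^2}{1 - \vec\pi^2} \right]. \tag{13.72} $$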


Notice that there is no mass term for the field $\vec\pi$, as required by Goldstone’s
theorem.
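
The identity (13.71) follows from differentiating $\sigma = (1 - \vec\pi^2)^{1/2}$, and it is easy to test numerically. The sketch below is our own illustration, not part of the original derivation: it takes $N = 3$, a hypothetical smooth field $\vec\pi(x)$ depending on a single coordinate, and compares a finite-difference evaluation of $|\partial \vec n|^2$ with the right-hand side of (13.71).

```python
import math


def pi_field(x):
    """Hypothetical smooth two-component pi field (N = 3), with |pi| < 1."""
    return (0.3 * math.sin(x), 0.2 * math.cos(2.0 * x))


def dpi_field(x):
    """Exact derivative of pi_field with respect to x."""
    return (0.3 * math.cos(x), -0.4 * math.sin(2.0 * x))


def n_field(x):
    """Unit vector n = (pi^1, pi^2, sigma), Eqs. (13.69)-(13.70)."""
    p1, p2 = pi_field(x)
    return (p1, p2, math.sqrt(1.0 - p1 * p1 - p2 * p2))


def lhs(x, h=1e-5):
    """|d n/dx|^2 from a central finite difference of n."""
    up, dn = n_field(x + h), n_field(x - h)
    return sum(((a - b) / (2.0 * h)) ** 2 for a, b in zip(up, dn))


def rhs(x):
    """Right-hand side of Eq. (13.71), evaluated with exact pi derivatives."""
    p, dp = pi_field(x), dpi_field(x)
    p_sq = sum(a * a for a in p)
    p_dot_dp = sum(a * b for a, b in zip(p, dp))
    return sum(b * b for b in dp) + p_dot_dp**2 / (1.0 - p_sq)
```

The two sides should agree up to the finite-difference error, which is far smaller than either term for the step size used here.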

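The perturbation theory for the $\pi^k$ field can be read off straightforwardly
by expanding the Lagrangian in powers of $\pi^k$:

$$ \mathcal{L} = \frac{1}{2g^2}\, |\partial_\mu\vec\pi|^2 + \frac{1}{2g^2}\, (\vec\pi\cdot\partial_\mu\vec\pi)^2 + \cdots. \tag{13.73} $$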

This leads to the Feynman rules shown in Fig. 13.1, plus additional vertices
with all even numbers of $\pi^k$ fields. Since the Lagrangian (13.67) is the most
general $O(N)$-symmetric Lagrangian with dimensionless coefficients that can
be built out of these fields, the theory can be made finite by renormalization of
the coupling constant $g$ and $O(N)$-symmetric rescaling of the fields $\pi^k$ and $\sigma$.
In renormalized perturbation theory, there are divergences and counterterms
for each possible $2n$-$\pi$ vertex; however, these counterterms are all related by
the basic requirement that the bare Lagrangian preserve the $O(N)$ symmetry.

Figure 13.1. Feynman rules for the nonlinear sigma model.

We now compute the Callan-Symanzik functions for this theory. Since
the theory is renormalizable, its Green’s functions obey the Callan-Symanzik
equation for some functions $\beta$, $\gamma$. Explicitly,

$$ \left[ M\frac{\partial}{\partial M} + \beta(g)\frac{\partial}{\partial g} + n\gamma(g) \right] G^{(n)} = 0, \tag{13.74} $$

where $G^{(n)}$ is a Green’s function of $n$ fields $\pi^k$ or $\sigma$. To identify the $\beta$ and $\gamma$
functions, to the leading order in perturbation theory, we compute two simple
Green’s functions to one-loop order and then see what forms are necessary if
the Callan-Symanzik equation is to be satisfied.

The first Green’s function we consider is

$$ G^{(1)} = \langle \sigma(x) \rangle. \tag{13.75} $$

Expanding the definition (13.70), we find

$$ \langle \sigma(0) \rangle = 1 - \tfrac{1}{2} \langle \vec\pi^2(0) \rangle + \cdots. \tag{13.76} $$

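To evaluate this formula, we use the propagator of Fig. 13.1 to compute

$$ \langle \pi^k(0)\pi^\ell(0) \rangle = \int \frac{d^dk}{(2\pi)^d}\, \frac{i g^2}{k^2 - \mu^2}\, \delta^{k\ell}. \tag{13.77} $$

We have added a small mass $\mu$ as an infrared cutoff. Then

$$ \langle \pi^k(0)\pi^\ell(0) \rangle = g^2\, \frac{\Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}\, (\mu^2)^{1-d/2}}\, \delta^{k\ell}. \tag{13.78} $$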

Using this result in our expression for $\langle\sigma\rangle$ and then subtracting at the momentum
scale $M$, we find

$$ \langle \sigma \rangle = 1 - \frac{1}{2}(N-1)\, \frac{g^2\, \Gamma(1-\frac{d}{2})}{(4\pi)^{d/2}} \left( \frac{1}{(\mu^2)^{1-d/2}} - \frac{1}{(M^2)^{1-d/2}} \right) + O(g^4) \;\underset{d\to 2}{\longrightarrow}\; 1 - \frac{g^2 (N-1)}{8\pi} \log\frac{M^2}{\mu^2} + O(g^4). \tag{13.79} $$

This expression satisfies the Callan-Symanzik equation to order $g^2$ only if

$$ \gamma(g) = \frac{g^2 (N-1)}{4\pi} + O(g^4). \tag{13.80} $$


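Next, consider the $\pi^k$ two-point function,

$$ \langle \pi^k(p)\pi^\ell(-p) \rangle = \frac{i g^2}{p^2}\, \delta^{k\ell} + \frac{i g^2}{p^2} \left( -i\Pi^{k\ell} \right) \frac{i g^2}{p^2} + \cdots. \tag{13.81} $$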

In evaluating $\Pi^{k\ell}$ from the Feynman rules in Fig. 13.1, we again encounter
the integral (13.77), and also the integral

$$ \langle \partial_\mu\pi^k(0)\, \partial^\mu\pi^\ell(0) \rangle = \int \frac{d^dk}{(2\pi)^d}\, \frac{i g^2 k^2}{k^2 - \mu^2}\, \delta^{k\ell} = -g^2\, \frac{\frac{d}{2}\, \Gamma(-\frac{d}{2})}{(4\pi)^{d/2}\, (\mu^2)^{-d/2}}\, \delta^{k\ell}. \tag{13.82} $$

This formula has no pole at $d = 0$, and for $d > 0$ it is proportional to a positive
power of $\mu^2$; hence, we can set this contraction to zero. Then


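$$ \Pi^{k\ell}(p) = -\delta^{k\ell}\, p^2\, \frac{1}{(4\pi)^{d/2}}\, \frac{\Gamma(1-\frac{d}{2})}{(\mu^2)^{1-d/2}}. \tag{13.83} $$

Subtracting at $M$ as above and taking the limit $d \to 2$, we find

$$ \langle \pi^k(p)\pi^\ell(-p) \rangle = \frac{i g^2}{p^2}\, \delta^{k\ell} + \frac{i g^2}{p^2} \left( +i p^2\, \frac{1}{4\pi} \log\frac{M^2}{\mu^2} \right) \frac{i g^2}{p^2}\, \delta^{k\ell} + \cdots = \frac{i g^2}{p^2}\, \delta^{k\ell} \left[ 1 - \frac{g^2}{4\pi} \log\frac{M^2}{\mu^2} + O(g^4) \right]. \tag{13.84} $$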

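Applying the Callan-Symanzik equation to this result gives

$$ 0 = \left[ M\frac{\partial}{\partial M} + \beta(g)\frac{\partial}{\partial g} + 2\gamma(g) \right] \langle \pi^k(p)\pi^\ell(-p) \rangle = \frac{i}{p^2}\, \delta^{k\ell} \left[ -\frac{g^4}{2\pi} + \beta(g)\cdot 2g + 2g^2\gamma(g) \right]. \tag{13.85} $$

Inserting the result (13.80) for $\gamma(g)$, we find

$$ \beta(g) = -(N-2)\frac{g^3}{4\pi} + O(g^5). \tag{13.86} $$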
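
As a cross-check of (13.80) and (13.86), one can apply the Callan-Symanzik operator to the one-loop Green's functions and confirm that the leftover terms are of higher order in $g$. The Python sketch below does this with hand-coded derivatives. The function names are ours, the variable `log_m2_over_mu2` stands for $\log(M^2/\mu^2)$, and the common factor $i\delta^{k\ell}/p^2$ of the two-point function is dropped.

```python
import math


def gamma_nlsm(g, n):
    """Leading-order gamma function, Eq. (13.80)."""
    return (n - 1) * g**2 / (4.0 * math.pi)


def beta_nlsm(g, n):
    """Leading-order beta function, Eq. (13.86)."""
    return -(n - 2) * g**3 / (4.0 * math.pi)


def cs_residual_sigma(g, n, log_m2_over_mu2):
    """[M d/dM + beta d/dg + gamma] acting on <sigma> of Eq. (13.79)."""
    big_l = log_m2_over_mu2
    value = 1.0 - (n - 1) * g**2 / (8.0 * math.pi) * big_l
    d_dlog_m = -(n - 1) * g**2 / (4.0 * math.pi)   # uses dL/dlog M = 2
    d_dg = -(n - 1) * g / (4.0 * math.pi) * big_l
    return d_dlog_m + beta_nlsm(g, n) * d_dg + gamma_nlsm(g, n) * value


def cs_residual_pipi(g, n, log_m2_over_mu2):
    """[M d/dM + beta d/dg + 2 gamma] acting on g^2 (1 - g^2 L / 4 pi), from (13.84)."""
    big_l = log_m2_over_mu2
    value = g**2 * (1.0 - g**2 / (4.0 * math.pi) * big_l)
    d_dlog_m = -g**4 / (2.0 * math.pi)
    d_dg = 2.0 * g - g**3 / math.pi * big_l
    return d_dlog_m + beta_nlsm(g, n) * d_dg + 2.0 * gamma_nlsm(g, n) * value
```

The first residual starts at order $g^4$ and the second at order $g^6$, one order beyond the terms that fix $\gamma$ and $\beta$.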


At $N = 2$ precisely, the $\beta$ function vanishes. This is not an accident
but rather is a nontrivial check of our calculation. For $N = 2$, we can make
the change of variables $\pi^1 = \sin\theta$; then $\sigma = \cos\theta$, and the Lagrangian takes
the form

$$ \mathcal{L} = \frac{1}{2g^2}\, (\partial_\mu\theta)^2. \tag{13.87} $$

This is a free field theory for the field $\theta(x)$, so it can have no renormalization
group flow.

For $N > 2$, the $\beta$ function is negative: This theory is asymptotically free.
The running coupling constant $g$ becomes small at small distances and grows
large at large distances.

In quantum electrodynamics, we found an appealing physical picture for 
the sign of the coupling constant evolution. As we discussed in Section 7.5, the 
process of virtual pair creation makes the vacuum a dielectric medium, which 
screens electric charge. One would therefore expect the effective Coulomb 
interaction of charge to decrease at large distances and increase at small distances.
It is easy to imagine that a similar screening phenomenon might occur
in any quantum field theory. Thus, it is surprising that, in this theory, we 
have found by explicit calculation that the coupling constant evolution has 
the opposite sign. What is the physical explanation for this? 

In fact, the original derivation of the asymptotic freedom of the nonlinear
sigma model, due to Polyakov,† gave a clear physical argument for the
sign of the evolution. Now that we have derived the $\beta$ function by the automatic
method of the Callan-Symanzik equation, let us review Polyakov’s
more physical derivation.

Polyakov analyzed the nonlinear sigma model using Wilson’s momentum-slicing
technique, which we discussed in Section 12.1. Consider, then, the
nonlinear sigma model defined with a momentum cutoff in place of the dimensional
regulator. As in Section 12.1, we work in Euclidean space with
initial cutoff $\Lambda$.

The original integration variables are the Fourier components of the unit
vector field $n^i(x)$. We wish to integrate out of the functional integral those
Fourier components corresponding to momenta $k$ in the range $b\Lambda < |k| < \Lambda$. If
the remaining components are Fourier-transformed back to coordinate space,
they describe a coarse-grained average of the original unit vector field. This
averaged field can be rescaled so that it is again a unit vector at each point.
Call this averaged and rescaled field $\tilde n^i$. Then we can write the relation of $n^i$
and $\tilde n^i$ as follows:

$$ n^i(x) = \tilde n^i(x)\, (1 - \phi^2)^{1/2} + \sum_{a=1}^{N-1} \phi_a(x)\, e^i_a(x). \tag{13.88} $$

In this equation, the vectors $\vec e_a(x)$ form a basis of unit vectors orthogonal
to $\tilde n(x)$. In Polyakov’s picture, $\tilde n(x)$ and the $\vec e_a(x)$ are slowly varying. On
the other hand, the coefficients $\phi_a(x)$ contain only Fourier components in the
range $b\Lambda < |k| < \Lambda$. These are the variables we integrate over to achieve the
renormalization group transformation.


†A. M. Polyakov, Phys. Lett. 59B, 79 (1975).
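To set up the integral over $\phi_a$, we first work out

$$ \partial_\mu n^i = \partial_\mu\tilde n^i\, (1-\phi^2)^{1/2} - \tilde n^i\, \frac{\phi\cdot\partial_\mu\phi}{(1-\phi^2)^{1/2}} + \partial_\mu\phi_a\, e^i_a + \phi_a\, \partial_\mu e^i_a. \tag{13.89} $$

By the definition of $\tilde n$, $\vec e_a$, these vectors satisfy

$$ |\tilde n|^2 = 1; \qquad \tilde n\cdot\vec e_a = 0. \tag{13.90} $$

Taking the derivative of these identities, we find

$$ \tilde n\cdot\partial_\mu\tilde n = 0; \qquad \tilde n\cdot\partial_\mu\vec e_a + \partial_\mu\tilde n\cdot\vec e_a = 0. \tag{13.91} $$

Using the identities in (13.90) and (13.91), we can compute the Lagrangian
of the nonlinear sigma model through terms quadratic in the $\phi_a$:

$$ \mathcal{L} = \frac{1}{2g^2}\, |\partial_\mu n^i|^2 = \frac{1}{2g^2} \Big[ |\partial_\mu\tilde n|^2 (1 - \phi^2) + (\partial_\mu\phi_a)^2 + 2(\phi_a\partial_\mu\phi_b)(\vec e_b\cdot\partial^\mu\vec e_a) + 2\,\partial_\mu\phi_a\, \partial^\mu\tilde n\cdot\vec e_a + 2\,\phi_a\, \partial_\mu\tilde n\cdot\partial^\mu\vec e_a + \phi_a\phi_b\, \partial_\mu\vec e_a\cdot\partial^\mu\vec e_b + \cdots \Big]. \tag{13.92} $$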

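We will consider the second term of (13.92) to be the zeroth-order Lagrangian
for $\phi_a$. Thus,

$$ \mathcal{L}_0 = \frac{1}{2g^2}\, (\partial_\mu\phi_a)^2, \tag{13.93} $$

which gives the propagator

$$ \langle \phi_a(p)\phi_b(-p) \rangle = \frac{g^2}{p^2}\, \delta_{ab}, \tag{13.94} $$

restricted to the momentum region $b\Lambda < |p| < \Lambda$. This propagator can be used
to integrate the remaining terms of the Lagrangian over the $\phi_a$. Borrowing
the integrals from the derivation of (13.84), we can set

$$ \langle \phi_a(0)\, \partial_\mu\phi_b(0) \rangle = \langle \partial_\mu\phi_a(0)\, \partial^\mu\phi_b(0) \rangle = 0 \tag{13.95} $$

and

$$ \langle \phi_a(0)\phi_b(0) \rangle = \delta_{ab}\, \frac{g^2}{4\pi} \log\frac{\Lambda^2}{(b\Lambda)^2}. \tag{13.96} $$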

Then, after the integral over $\phi$, the new Lagrangian is given approximately
by

$$ \mathcal{L}_{\text{eff}} = \frac{1}{2g^2} \left[ |\partial_\mu\tilde n|^2 (1 - \langle\phi^2\rangle) + \langle\phi_a\phi_b\rangle\, \partial_\mu\vec e_a\cdot\partial^\mu\vec e_b + O(g^4) \right], \tag{13.97} $$

where the expectation values of $\phi_a$ are given by (13.96).

To simplify this further, we must simplify the structure $(\partial_\mu\vec e_a)^2$ that appears
in the second term of (13.97). Introduce a complete basis of vectors:

$$ (\partial_\mu\vec e_a)^2 = (\tilde n\cdot\partial_\mu\vec e_a)^2 + (\vec e_c\cdot\partial_\mu\vec e_a)^2. \tag{13.98} $$

The second term on the right is a new structure, associated with the torsion
of the coordinate system for $\vec e_a$; it turns out to correspond to an irrelevant
operator induced by the renormalization procedure. The first term, however,
can be put into a familiar form by using the two identities (13.91):

$$ (\tilde n\cdot\partial_\mu\vec e_a)^2 = (\vec e_a\cdot\partial_\mu\tilde n)^2 = (\partial_\mu\tilde n)^2. \tag{13.99} $$


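Then

$$ \mathcal{L}_{\text{eff}} = \frac{1}{2g^2}\, |\partial_\mu\tilde n|^2 \left( 1 - (N-2)\, \frac{g^2}{4\pi} \log\frac{1}{b^2} + \cdots \right) = \frac{1}{2} \left( \frac{1}{g^2} + \frac{N-2}{2\pi} \log b + \cdots \right) |\partial_\mu\tilde n|^2. \tag{13.100} $$

The quantity in parentheses in the last expression is the inverse square of a
running coupling constant $\bar g$. To the order of our calculation, $\bar g$ satisfies

$$ \frac{d\bar g}{d\log b} = -(N-2)\frac{\bar g^3}{4\pi}, \tag{13.101} $$

in agreement with (13.86).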
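
The flow described by (13.101) can be integrated in closed form at this order, which gives a quick way to see the growth of the coupling at large distances. The sketch below is an illustration under our own assumptions: the function names are hypothetical, and the closed form $1/\bar g^2 = 1/g^2 + \frac{N-2}{2\pi}\log b$ is the leading-order solution with $\bar g = g$ at $b = 1$.

```python
import math


def g_bar(g, b, n):
    """Leading-order running coupling: 1/g_bar^2 = 1/g^2 + (N-2)/(2 pi) * log b."""
    inv_sq = 1.0 / g**2 + (n - 2) / (2.0 * math.pi) * math.log(b)
    if inv_sq <= 0.0:
        raise ValueError("coupling has run out of the perturbative regime")
    return 1.0 / math.sqrt(inv_sq)


def flow_residual(g, b, n, h=1e-6):
    """d g_bar / d log b + (N-2) g_bar^3 / (4 pi), which Eq. (13.101) sets to zero."""
    up = g_bar(g, b * math.exp(h), n)
    dn = g_bar(g, b * math.exp(-h), n)
    derivative = (up - dn) / (2.0 * h)
    return derivative + (n - 2) * g_bar(g, b, n) ** 3 / (4.0 * math.pi)
```

For $N > 2$, lowering $b$ (integrating out more short-distance modes) makes `g_bar` larger, while for $N = 2$ it stays fixed, matching the free-field result.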

In this calculation, the sign of the coupling constant renormalization
comes from the fact that the effective length of the unit vector $\vec n$ is reduced by
averaging over short-wavelength fluctuations. This lowers the effective action
associated with a configuration in which the direction of $\tilde n$ changes over a displacement
$\Delta x$ (see Fig. 13.2). Looking back at (13.67), we see that a decrease
of the magnitude of $\mathcal{L}$ for the same configuration of $\tilde n$ can be interpreted as
an increase of the effective coupling. Thus the nonlinear sigma model is more
strongly coupled, or, in terms of the physical configuration of the $\tilde n$ field, more
disordered, at large distances.

Our calculation implies that, if any two-dimensional statistical system 
apparently has spontaneously broken symmetry and Goldstone bosons, then, 
at large distances, the ordering disappears. This is an unexpected conclusion. 
However, this conclusion is in accord with a theorem proved by Mermin and 
Wagner* that a two-dimensional system with a continuous symmetry cannot 
support an ordered state in which a symmetry-breaking field has a nonzero 
vacuum expectation value. This theorem applies to the case N = 2 as well as 
to N >2. We have motivated this theorem in Problem 11.1. 


The Nonlinear Sigma Model for 2 < d < 4 


We now extend the results of this analysis to dimensions cl > 2. In general d, 
we will continue to define the action of the nonlinear sigma model by 



(13.102) 


where ft is still dimensionless, since it obeys the constraint |n| 2 = 1. Thus 
g has the dimensions (mass) (2_d) / 2 . We define a dimensionless coupling by 


*N. D. Mermin and H. Wagner, Phys. Rev. Lett. 17, 1133 (1966).

Figure 13.2. Averaging of the direction of $n$, and its interpretation as an
increase of the running coupling constant.

writing 

$$T = g^2 M^{d-2}, \tag{13.103}$$

just as we did in (12.122). If (13.102) is viewed as the Boltzmann weight of 
a partition function, then T is a dimensionless variable proportional to the 
temperature. 

From (13.103), we can find the $\beta$ function for $T$ in $d$ dimensions, in
analogy to Eq. (12.131):

$$\beta(T) = (d-2)\,T + 2g\,\beta^{(2)}(g), \tag{13.104}$$

where $\beta^{(2)}(g)$ denotes the two-dimensional result (13.86) and the factor
of $2g$ in the second term comes from the definition $T \sim g^2$.
Since $n$ is dimensionless, the $\gamma$ function is unchanged from the two-dimensional
result when expressed in terms of dimensionless couplings. Thus, in $d = 2 + \epsilon$,

$$\beta(T) = +\epsilon T - (N-2)\frac{T^2}{2\pi}; \qquad \gamma(T) = (N-1)\frac{T}{4\pi}. \tag{13.105}$$


Notice that the $\beta$ function for $T$ has a nontrivial zero, which approaches
$T = 0$ as $\epsilon \to 0$. This zero is located at

$$T_* = \frac{2\pi\epsilon}{N-2}. \tag{13.106}$$


The form of the $\beta$ function is sketched in Fig. 13.3. In contrast to the
Wilson-Fisher zero in $d = 4 - \epsilon$, discussed in Section 12.5, this is an
ultraviolet-stable fixed point. The flows to the infrared go out from this fixed
point. Since $T$ is proportional to the temperature of the corresponding
statistical system, $T \to 0$ is the state of complete order, while $T \gg 1$ is
the state of complete disorder. This agrees with the intuition that accompanied
Polyakov's derivation of the $\beta$ function. The fixed point $T_*$ corresponds to
the critical temperature. Thus, the critical temperature tends to zero as
$d \to 2$, in accord with the Mermin-Wagner theorem.

Figure 13.3. The form of the $\beta$ function in the nonlinear sigma model for
$d > 2$.
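The statement that flows to the infrared go out from $T_*$ can be illustrated numerically. The following is a minimal sketch, assuming the one-loop form (13.105); the function names, the Euler integrator, and the values $\epsilon = 0.1$, $N = 3$ are illustrative choices of ours, not part of the text.

```python
import math


def beta_T(T, eps, N):
    """One-loop beta function of Eq. (13.105) in d = 2 + eps."""
    return eps * T - (N - 2) * T**2 / (2.0 * math.pi)


def fixed_point(eps, N):
    """Nontrivial zero T_* of beta(T), Eq. (13.106)."""
    return 2.0 * math.pi * eps / (N - 2)


def flow_to_infrared(T0, eps, N, log_range=5.0, steps=5000):
    """Euler-integrate dT/dlog(p) = beta(T) from log(p/M) = 0 down to -log_range."""
    h = -log_range / steps  # negative step: toward smaller momenta
    T = T0
    for _ in range(steps):
        T += h * beta_T(T, eps, N)
    return T


eps, N = 0.1, 3  # illustrative values (assumption)
T_star = fixed_point(eps, N)
T_low = flow_to_infrared(0.9 * T_star, eps, N)   # starts just below T_*
T_high = flow_to_infrared(1.1 * T_star, eps, N)  # starts just above T_*
print(T_low / T_star, T_high / T_star)
```

A coupling that starts slightly below $T_*$ drifts toward $T = 0$ (order) at low momenta, while one that starts slightly above drifts toward larger $T$ (disorder), as expected for an ultraviolet-stable fixed point.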


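We can now compute the critical exponents of the nonlinear sigma model
in an expansion in $\epsilon = d - 2$. The exponent $\eta$ is given by

$$\eta = 2\gamma(T_*) - \epsilon = \frac{\epsilon}{N-2}, \tag{13.107}$$

where the subtraction of $\epsilon$ accounts for the fact that $n$ is dimensionless,
while $\eta$ is defined relative to the canonical scalar-field dimension $(d-2)/2$.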

To find the second exponent $\nu$, we need to identify the relevant perturbation
that corresponds to the renormalization group flow away from the fixed point
for $T \neq T_*$. This is just the deviation of $T$ from $T_*$:

$$\rho_T = T - T_*. \tag{13.108}$$


From the renormalization group equation for the running coupling constant,
we find that the running $\rho_T$ obeys

$$\frac{d}{d\log p}\,\rho_T = \left[\frac{d}{dT}\beta(T)\bigg|_{T=T_*}\right]\rho_T. \tag{13.109}$$


The quantity in brackets is negative. As in Eqs. (12.134) and (12.137), we can
identify this quantity with $(-1/\nu)$. At a momentum $p \ll M$,

$$\rho_T(p) = \rho_T \left(\frac{M}{p}\right)^{1/\nu}; \tag{13.110}$$

thus $\rho_T(p)$ becomes of order 1 at a momentum that is the inverse of
$\xi \sim (T - T_*)^{-\nu}$, as required. Using the explicit form of the $\beta$
function from (13.105), we find

$$\nu = \frac{1}{\epsilon}, \tag{13.111}$$


independent of $N$ to this order in $\epsilon$. (Of course, these results apply only for
$N \geq 3$.) The thermodynamic critical exponents can be found from (13.107)
and (13.111) using the model-independent relations derived in Section 13.1.
When the values found here for $\nu$ and $\eta$ are extrapolated to $d = 3$ (that is,
$\epsilon = 1$), the agreement with experiment is not spectacular, but the results at
least suggest that the fixed point we have found here may be the continuation
of the Wilson-Fisher fixed point to the vicinity of two dimensions.
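As a quick numerical check of the statement that $\nu$ is independent of $N$ at this order, one can differentiate the $\beta$ function of (13.105) at its zero and apply the identification of the slope with $-1/\nu$. This is a sketch under that one-loop assumption; the helper names and the finite-difference step are ours.

```python
import math


def beta_T(T, eps, N):
    """One-loop beta function of Eq. (13.105) in d = 2 + eps."""
    return eps * T - (N - 2) * T**2 / (2.0 * math.pi)


def nu_from_slope(eps, N, h=1e-6):
    """Estimate nu from the slope of beta at T_*: d(beta)/dT = -1/nu, cf. (13.109)."""
    T_star = 2.0 * math.pi * eps / (N - 2)  # Eq. (13.106)
    slope = (beta_T(T_star + h, eps, N) - beta_T(T_star - h, eps, N)) / (2.0 * h)
    return -1.0 / slope


eps = 0.1  # illustrative value (assumption)
for N in (3, 5, 10):
    print(N, nu_from_slope(eps, N))  # each value should be close to 1/eps
```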
Exact Solution for Large N


It is possible to obtain further insight into the nature of this fixed point
by attacking the nonlinear sigma model using another approach. Since the
nonlinear sigma model depends on a parameter $N$, the number of components
of the unit vector, it is reasonable to ask how this model behaves as $N \to \infty$.
We now show that if we take this limit holding $g^2N$ fixed, we can obtain an
exact solution to the model with nontrivial behavior.

The manipulations that lead to this solution are most clear if we work in 
Euclidean space, regarding the Lagrangian as the Boltzmann weight of a spin 
system. Then we must compute the functional integral 


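$$Z = \int \mathcal{D}n\, \exp\left[-\frac{1}{2g_0^2}\int d^dx\,(\partial_\mu n)^2\right]\cdot\prod_x \delta\big(n^2(x) - 1\big). \tag{13.112}$$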


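Here $g_0$ is the bare value of the coupling constant, while the product of delta
functions, one at each point, enforces the constraint. Introduce an integral
representation of the delta functions; this requires a second functional integral
over a Lagrange multiplier variable $\alpha(x)$:

$$Z = \int \mathcal{D}\alpha\,\mathcal{D}n\, \exp\left[-\frac{1}{2g_0^2}\int d^dx\,(\partial_\mu n)^2 - \frac{i}{2g_0^2}\int d^dx\,\alpha\,(n^2 - 1)\right]. \tag{13.113}$$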


Now the variable n is unconstrained and appears in the exponent only 
quadratically. Thus, we can integrate over n, to obtain 


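$$Z = \int \mathcal{D}\alpha\,\big(\det[-\partial^2 + i\alpha(x)]\big)^{-N/2} \exp\left[\frac{i}{2g_0^2}\int d^dx\,\alpha\right]
= \int \mathcal{D}\alpha\, \exp\left[-\frac{N}{2}\,\mathrm{tr}\log(-\partial^2 + i\alpha) + \frac{i}{2g_0^2}\int d^dx\,\alpha\right]. \tag{13.114}$$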


Since we are taking the limit $N \to \infty$ with $g_0^2N$ held fixed, both terms
in the exponent are of order $N$. Thus it makes sense to evaluate the integral
by steepest descents. This entails dominating the integral by the value of the
function $\alpha(x)$ that minimizes the exponent. To determine this configuration,
we compute the functional derivative of the exponent with respect to $\alpha(x)$.
This gives the variational equation

$$\frac{N}{2}\,\langle x|\,\frac{1}{-\partial^2 + i\alpha}\,|x\rangle = \frac{1}{2g_0^2}. \tag{13.115}$$


The left-hand side of this equation must be constant and real; thus, we should
look for a solution in which $\alpha(x)$ is constant and pure imaginary. Write

$$\alpha(x) = -im^2; \tag{13.116}$$

then $m^2$ obeys

$$N\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{k^2 + m^2} = \frac{1}{g_0^2}. \tag{13.117}$$

We will study this equation first in $d = 2$. If we define the integral in
(13.117) with a momentum cutoff, we can evaluate this integral and find the
equation for $m$:

$$\frac{N}{2\pi}\log\frac{\Lambda}{m} = \frac{1}{g_0^2}. \tag{13.118}$$

We can make this equation finite by the renormalization

$$\frac{1}{g_0^2} = \frac{1}{g^2} + \frac{N}{2\pi}\log\frac{\Lambda}{M}, \tag{13.119}$$

which introduces an arbitrary renormalization scale $M$. Then we can solve for
$m$, to find

$$m = M\exp\left[-\frac{2\pi}{g^2N}\right]. \tag{13.120}$$


This is a nonzero, $O(N)$-invariant mass term for the $N$ unconstrained
components of $n$. In this solution, $\langle n\rangle = 0$ and the symmetry is unbroken, for any
value of $g^2$ or $T$.

The solution of the theory does depend on the arbitrary renormalization
scale $M$; this dependence simply reflects the arbitrariness of the definition
of the renormalized coupling constant. The statement that $m$ follows
unambiguously from an underlying theory with fixed bare coupling and cutoff is
precisely the statement that $m$ obeys the Callan-Symanzik equation with no
overall rescaling:

$$\left[M\frac{\partial}{\partial M} + \beta(g)\frac{\partial}{\partial g}\right] m(g^2, M) = 0. \tag{13.121}$$

Using the large-$N$ limit of (13.86),

$$\beta(g) = -\frac{g^3 N}{4\pi}, \tag{13.122}$$

it is easy to check that (13.121) is satisfied. Conversely, the validity of (13.121)
with (13.122) tells us that Eq. (13.122) is an exact representation of the $\beta$
function to all orders in $g^2N$ in the limit of large $N$. The corrections to (13.122)
are of order $(1/N)$ or, equivalently, of order $g^2$ with no compensatory factor of
$N$. Equation (13.122) agrees with our earlier calculation (13.86) to this order.
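The check that (13.120) satisfies (13.121) can also be carried out numerically. The sketch below, with hypothetical helper names and illustrative parameter values, evaluates the Callan-Symanzik operator on $m$ by finite differences, and separately confirms that $m$ is unchanged when $M$ is varied at fixed bare coupling and cutoff, using (13.119).

```python
import math


def mass(g, M, N):
    """Dynamically generated mass, Eq. (13.120)."""
    return M * math.exp(-2.0 * math.pi / (g**2 * N))


def beta_large_N(g, N):
    """Large-N beta function, Eq. (13.122)."""
    return -(g**3) * N / (4.0 * math.pi)


def callan_symanzik_residual(g, M, N, h=1e-6):
    """Finite-difference value of [M d/dM + beta(g) d/dg] m, cf. Eq. (13.121)."""
    dm_dM = (mass(g, M * (1 + h), N) - mass(g, M * (1 - h), N)) / (2 * M * h)
    dm_dg = (mass(g + h, M, N) - mass(g - h, M, N)) / (2 * h)
    return M * dm_dM + beta_large_N(g, N) * dm_dg


def renormalized_g2(g0_sq, cutoff, M, N):
    """Solve Eq. (13.119) for g^2 at scale M, given bare g_0^2 and the cutoff."""
    return 1.0 / (1.0 / g0_sq - N / (2.0 * math.pi) * math.log(cutoff / M))


# Illustrative values (assumptions, not taken from the text).
N, g0_sq, cutoff = 10, 0.5, 100.0
print(callan_symanzik_residual(1.0, 2.0, N) / mass(1.0, 2.0, N))
for M in (50.0, 80.0):
    g = math.sqrt(renormalized_g2(g0_sq, cutoff, M, N))
    print(M, mass(g, M, N))  # the same m for every choice of M
```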

Now let us redo this exercise in $d > 2$. In this case, the integral in (13.117)
diverges as a power of the cutoff. Even when the dependence on $\Lambda$ is removed
by renormalization, this change in behavior leads to a change in the
dependence of the integral on $m$, which has important physical implications.

It is not difficult to work out the integral in (13.117) as an expansion in
$(\Lambda/m)$. One finds:

$$\int\frac{d^dk}{(2\pi)^d}\,\frac{1}{k^2 + m^2} =
\begin{cases}
C_1\Lambda^{d-2} - C_2\, m^{d-2} + \cdots & \text{for } d < 4, \\
C_1\Lambda^{d-2} - \tilde C_2\, m^2\Lambda^{d-4} + \cdots & \text{for } d > 4,
\end{cases} \tag{13.123}$$

where $C_1$, $C_2$, $\tilde C_2$ are some functions of $d$. In particular,

$$C_1 = \left[2^{d-1}\pi^{d/2}\,\Gamma(\tfrac{d}{2})\,(d-2)\right]^{-1}. \tag{13.124}$$

In $d > 4$, the first derivative of the integral with respect to $m^2$ is smooth as
$m^2 \to 0$; this is the reason for the change in behavior.

In the case $d = 2$, the left-hand side of (13.117) covered the whole range
from 0 to $\infty$ as $m$ was varied; thus, we could always find a solution for any
value of $g_0^2$. In $d > 2$, this is no longer true. Equation (13.117) can be solved
for $m$ only if $Ng_0^2$ is greater than the critical value

$$Ng_c^2 = (C_1\Lambda^{d-2})^{-1}. \tag{13.125}$$

Just at the boundary, $m = 0$. For bare couplings weaker than (13.125), it is
possible to lower the value of the effective action by giving one component of
$n$ a vacuum expectation value while keeping the other components massless.
Thus (13.125) is the criterion for the second-order phase transition in this
model. Equation (13.124) implies that the critical value of $g_0^2$, which is
proportional to the critical temperature, goes to zero as $d \to 2$, in accord with
our renormalization-group analysis.

In the symmetric phase of the nonlinear sigma model, the mass $m$
determines the exponential fall-off of correlations, so $\xi = m^{-1}$. Thus we can
determine the exponent $\nu$ by solving for the dependence of $m$ on the
deviation of $g_0^2$ from the critical temperature. Write

$$t = \frac{g_0^2 - g_c^2}{g_0^2}. \tag{13.126}$$

Then, in $2 < d < 4$, we can use (13.123) to solve (13.117) for $m$ for small
values of $t$. This gives

$$\frac{1}{Ng_c^2}\, t = C_2\, m^{d-2}, \tag{13.127}$$

which implies $m \sim t^\nu$ with

$$\nu = \frac{1}{d-2}, \qquad 2 < d < 4. \tag{13.128}$$

Similarly,

$$\nu = \frac{1}{2}, \qquad d > 4. \tag{13.129}$$

The discontinuity in the dependence of $\nu$ on $d$ is exactly what we predicted
from renormalization group analysis. For $d > 4$, the value of $\nu$ goes over to
the prediction of naive dimensional analysis. The value of $\nu$ given by (13.128)
is in precise agreement with (13.111), in the expansion $\epsilon = d - 2$, and with
the $N \to \infty$ limit of (12.142), in the expansion $\epsilon = 4 - d$. Apparently, all of
our results for critical exponents mesh in a very satisfying way.
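The scaling $m \sim t^\nu$ with $\nu = 1/(d-2)$ can be checked directly in $d = 3$, where the gap equation (13.117) with a sharp momentum cutoff has the closed form $\int d^3k/(2\pi)^3\,(k^2+m^2)^{-1} = [\Lambda - m\arctan(\Lambda/m)]/(2\pi^2)$. The sketch below is an illustration under that assumption (sharp cutoff, $d = 3$, units with $\Lambda = 1$); the function names and the bisection solver are ours.

```python
import math


def gap_integral_3d(m, cutoff):
    """Integral of d^3k/(2 pi)^3 1/(k^2 + m^2) over |k| < cutoff, in closed form."""
    if m == 0.0:
        return cutoff / (2.0 * math.pi**2)
    return (cutoff - m * math.atan(cutoff / m)) / (2.0 * math.pi**2)


def critical_coupling(cutoff):
    """N g_c^2 from Eq. (13.125), using C_1 = 1/(2 pi^2) in d = 3."""
    return 2.0 * math.pi**2 / cutoff


def solve_mass(N_g0sq, cutoff, iterations=200):
    """Bisect for m solving N g_0^2 * I(m) = 1; returns None below criticality."""
    if N_g0sq * gap_integral_3d(0.0, cutoff) <= 1.0:
        return None  # no symmetric-phase solution
    lo, hi = 0.0, cutoff
    while N_g0sq * gap_integral_3d(hi, cutoff) > 1.0:
        hi *= 2.0  # the integral decreases with m, so enlarge the bracket
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if N_g0sq * gap_integral_3d(mid, cutoff) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mass_at(t, cutoff=1.0):
    """Mass for reduced coupling t = (g_0^2 - g_c^2)/g_0^2, Eq. (13.126)."""
    return solve_mass(critical_coupling(cutoff) / (1.0 - t), cutoff)


t1, t2 = 1e-3, 1e-4
nu_estimate = math.log(mass_at(t1) / mass_at(t2)) / math.log(t1 / t2)
print(nu_estimate)  # close to 1/(d-2) = 1 for d = 3
```

For couplings below the critical value the solver returns `None`, reflecting the absence of a solution of (13.117) in the ordered phase.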

By combining all of our results, we arrive at a pleasing picture of the
behavior of scalar field theory as a function of spacetime dimensionality. Above
four dimensions, any scalar field interaction is irrelevant and the expected
behavior is trivial. Just at four dimensions, the coupling constant tends to
zero only logarithmically at large scale, giving rise to a renormalizable
theory with predictions such as those in Section 13.2. Below four dimensions,
the theory is intrinsically a theory of interacting scalar fields, dominated by
the Wilson-Fisher fixed point. The coupling at this fixed point is small near
four dimensions but grows large as the dimensionality decreases. Finally (for
$N > 2$), as $d \to 2$, the fixed-point theory approaches the weak-coupling limit
of a completely different Lagrangian with the same symmetries, the nonlinear
sigma model.

This evolution of the behavior of the model as a function of d illustrates 
the main point of the previous two chapters: The qualitative behavior of a 
quantum field theory is determined not by the fundamental Lagrangian, but 
rather by the nature of the renormalization group flow and its fixed points. 
These, in turn, depend only on the basic symmetries that are imposed on the 
family of Lagrangians that flow into one another. This conclusion signals, at 
the deepest level, the importance of symmetry principles in determining the 
fundamental laws of physics. 


Problems 


13.1 Correction-to-scaling exponent. For critical phenomena in $4 - \epsilon$ dimensions,
the irrelevant contributions that disappear most slowly are those associated with the
deviation of the coupling constant $\lambda$ from its fixed-point value. This gives the most
important nonuniversal correction to the scaling laws derived in Section 13.1. By studying
the solution of the Callan-Symanzik equation, show that if the bare value of $\lambda$ differs
slightly from $\lambda_*$, the Gibbs free energy receives a correction

$$G(M,t) \to G(M,t)\cdot\left(1 + (\lambda - \lambda_*)\, t^{\omega\nu}\, \hat k(tM^{-1/\beta})\right).$$

This formula defines a new critical exponent $\omega$, called the correction-to-scaling
exponent. Show that

$$\omega = \frac{d}{d\lambda}\beta\bigg|_{\lambda_*} = \epsilon + O(\epsilon^2).$$


13.2 The exponent $\eta$. By combining the result of Problem 10.3 with an appropriate
renormalization prescription, show that the leading term in $\gamma(\lambda)$ in $\phi^4$ theory is

$$\gamma = \frac{\lambda^2}{12(4\pi)^4}.$$

Generalize this result to the $O(N)$-symmetric $\phi^4$ theory to derive Eq. (13.47). Compute
the leading-order ($\epsilon^2$) contribution to $\eta$.

13.3 The $CP^N$ model. The nonlinear sigma model discussed in the text can be
thought of as a quantum theory of fields that are coordinates on the unit sphere.
A slightly more complicated space of high symmetry is complex projective space,
$CP^N$. This space can be defined as the space of $(N+1)$-dimensional complex vectors
$(z_1, \ldots, z_{N+1})$ subject to the condition

$$\sum_j |z_j|^2 = 1,$$

with points related by an overall phase rotation identified, that is,

$$(e^{i\alpha}z_1, \ldots, e^{i\alpha}z_{N+1}) \quad \text{identified with} \quad (z_1, \ldots, z_{N+1}).$$

In this problem, we study the two-dimensional quantum field theory whose fields are
coordinates on this space.

(a) One way to represent a theory of coordinates on $CP^N$ is to write a Lagrangian
depending on fields $z_j(x)$, subject to the constraint, which also has the local
symmetry

$$z_j(x) \to e^{i\alpha(x)} z_j(x),$$

independently at each point $x$. Show that the following Lagrangian has this
symmetry:

$$\mathcal{L} = \frac{1}{g^2}\left[\,|\partial_\mu z_j|^2 - |z_j^*\partial_\mu z_j|^2\,\right].$$

To prove the invariance, you will need to use the constraint on the $z_j$, and its
consequence

$$z_j^*\partial_\mu z_j = -(\partial_\mu z_j^*)\, z_j.$$

Show that the nonlinear sigma model for the case $N = 3$ can be converted to
the $CP^N$ model for the case $N = 1$ by the substitution

$$n^i = z^*\sigma^i z,$$

where $\sigma^i$ are the Pauli sigma matrices.

(b) To write the Lagrangian in a simpler form, introduce a scalar Lagrange multiplier
$\lambda$ which implements the constraint and also a vector Lagrange multiplier $A_\mu$, to
express the local symmetry. More specifically, show that the Lagrangian of the
$CP^N$ model is obtained from the Lagrangian

$$\mathcal{L} = \frac{1}{g^2}\left[\,|D_\mu z_j|^2 - \lambda(|z_j|^2 - 1)\,\right],$$

where $D_\mu = (\partial_\mu + iA_\mu)$, by functionally integrating over the fields $\lambda$ and $A_\mu$.

(c) We can solve the $CP^N$ model in the limit $N \to \infty$ by integrating over the fields
$z_j$. Show that this integral leads to the expression

$$Z = \int \mathcal{D}A\,\mathcal{D}\lambda\, \exp\left[-N\,\mathrm{tr}\log(-D^2 - \lambda) + \frac{i}{g^2}\int d^2x\,\lambda\right],$$

where we have kept only the leading terms for $N \to \infty$, $g^2N$ fixed. Using
methods similar to those we used for the nonlinear sigma model, examine the
conditions for minimizing the exponent with respect to $\lambda$ and $A_\mu$. Show that these
conditions have a solution at $A_\mu = 0$ and $\lambda = m^2 > 0$. Show that, if $g^2$ is
renormalized at the scale $M$, $m$ can be written as

$$m = M\exp\left[-\frac{2\pi}{g^2N}\right].$$

(d) Now expand the exponent about $A_\mu = 0$. Show that the first nontrivial term
in this expansion is proportional to the vacuum polarization of massive scalar
fields. Evaluate this expression using dimensional regularization, and show that
it yields a standard kinetic energy term for $A_\mu$. Thus the strange nonlinear
field theory that we started with is finally transformed into a theory of $(N + 1)$
massive scalar fields interacting with a massless photon.



Final Project 


The Coleman-Weinberg Potential 


In Chapter 11 and Section 13.2 we discussed the effective potential for an
$O(N)$-symmetric $\phi^4$ theory in four dimensions. We computed the perturbative
corrections to this effective potential, and used the renormalization group to 
clarify the behavior of the potential for small values of the scalar field mass. 
After all this work, however, we found that the qualitative dependence of the 
theory on the mass parameter was unchanged by perturbative corrections. 
The theory still possessed a second-order phase transition as a function of 
the mass. The loop corrections affected this picture only in providing some 
logarithmic corrections to the scaling behavior near the phase transition. 

However, loop corrections are not always so innocuous. For some systems,
they can change the structure of the phase transition qualitatively. This
Final Project treats the simplest example of such a system, the
Coleman-Weinberg model. The analysis of this model draws on a broad variety of topics
discussed in Part II; it provides a quite nontrivial application of the effective
potential formalism and the use of the renormalization group equation.
The phenomenon displayed in this exercise reappears in many contexts, from
displacive phase transitions in solids to the thermodynamics of the early
universe.

This problem makes use of material in starred sections of the book, in 
particular, Sections 11.3, 11.4, and 13.2. Parts (a) and (e), however, depend 
only on the unstarred material of Part II. We recommend part (e) as excellent 
practice in the computation of renormalization group functions. 

The Coleman-Weinberg model is the quantum electrodynamics of a scalar
field in four dimensions, considered for small values of the scalar field mass.
The Lagrangian is

$$\mathcal{L} = -\frac{1}{4}(F_{\mu\nu})^2 + (D_\mu\phi)^\dagger D^\mu\phi - m^2\phi^\dagger\phi - \frac{\lambda}{6}(\phi^\dagger\phi)^2,$$

where $\phi(x)$ is a complex-valued scalar field and $D_\mu\phi = (\partial_\mu + ieA_\mu)\phi$.

(a) Assume that $m^2 = -\mu^2 < 0$, so that the symmetry $\phi(x) \to e^{-i\alpha}\phi(x)$ is
spontaneously broken. Write out the expression for $\mathcal{L}$, expanded around
the broken-symmetry state by introducing

$$\phi(x) = \phi_0 + \frac{1}{\sqrt{2}}\left[\sigma(x) + i\pi(x)\right],$$

where $\phi_0$, $\sigma(x)$, and $\pi(x)$ are real-valued. Show that the $A_\mu$ field acquires a
mass. This mechanism of mass generation for vector fields is called the
Higgs mechanism. We will study it in great detail in Chapter 20.

(b) Working in Landau gauge ($\partial^\mu A_\mu = 0$), compute the one-loop correction
to the effective potential $V(\phi_{\rm cl})$. Show that it is renormalized by
counterterms for $m^2$ and $\lambda$. Renormalize by minimal subtraction, introducing a
renormalization scale $M$.

(c) In the result of part (b), take the limit $\mu^2 \to 0$. The result should be
an effective potential that is scale-invariant up to logarithms containing
$M$. Analyze this expression for $\lambda$ very small, of order $(e^2)^2$. Show that,
with this choice of coupling constants, $V(\phi_{\rm cl})$ has a symmetry-breaking
minimum at a value of $\phi_{\rm cl}$ for which no logarithm is large, so that a
straightforward perturbation theory analysis should be valid. Thus the
$\mu^2 = 0$ theory, for this choice of coupling constants, still has
spontaneously broken symmetry, due to the influence of quantum corrections.

(d) Sketch the behavior of $V(\phi_{\rm cl})$ as a function of $m^2$, on both sides of $m^2 = 0$,
for the choice of coupling constants made in part (c).

(e) Compute the Callan-Symanzik $\beta$ functions for $e$ and $\lambda$. You should find

$$\beta_e = \frac{e^3}{48\pi^2}, \qquad \beta_\lambda = \frac{1}{24\pi^2}\left(5\lambda^2 - 18e^2\lambda + 54e^4\right).$$

Sketch the renormalization group flows in the $(\lambda, e^2)$ plane. Show that
every renormalization group trajectory passes through the region of coupling
constants considered in part (c).

(f) Construct the renormalization-group-improved effective potential at $\mu^2 = 0$
by applying the results of part (e) to the calculation of part (c). Compute
$\langle\phi\rangle$ and the mass of the $\sigma$ particle as a function of $\lambda$, $e^2$, $M$. Compute
the ratio $m_\sigma/m_A$ to leading order in $e^2$, for $\lambda \ll e^2$.

(g) Include the effects of a nonzero $m^2$ in the analysis of part (f). Show that
$m_\sigma/m_A$ takes a minimum nonzero value as $m^2$ increases from zero, before
the broken symmetry state disappears entirely. Compute this value as a
function of $e^2$, for $\lambda \ll e^2$.

(h) The Lagrangian of this problem (in its Euclidean form) is equivalent to
the Landau free energy for a superconductor in $d$ dimensions, coupled
to an electromagnetic field. This expression is known as the
Landau-Ginzburg free energy. Compute the $\beta$ functions for this system and sketch
the renormalization group flows for $d = 4 - \epsilon$. Describe the qualitative
behavior you would expect for the superconducting phase transition in
three dimensions. (For realistic superconductors, the value of $e^2$, after it
is made dimensionless in the appropriate way, is very small. The effect
you will find is expected to be important only for $|T - T_C|/T_C < 10^{-5}$.)
Non-Abelian Gauge Theories




Chapter 14 


Invitation: The Parton Model 
of Hadron Structure 


In Part II of this book, we explored the structure of quantum field theories in a 
formal way. We developed sophisticated calculational algorithms (Chapter 10), 
derived a formalism for the extraction of scaling laws and asymptotic behavior 
(Chapter 12), and worked out some of the consequences of spontaneously 
broken symmetry (Chapter 11). Much of this formalism turned out to have 
unexpected applications in statistical mechanics. However, we have not yet 
investigated its implications for elementary particle physics. To do so, we must 
first ask which particular quantum field theories describe the interactions of 
elementary particles. 

Since the mid-1970s, most high-energy physicists have agreed that the
elementary particles that make up matter are a set of fermions, interacting
primarily through the exchange of vector bosons. The elementary fermions
include the leptons (the electron, its heavy counterparts $\mu$ and $\tau$, and a
neutral, almost massless neutrino corresponding to each of these species), and
the quarks, whose bound states form the particles with nuclear interactions,
mesons and baryons (collectively called hadrons). These fermions interact
through three forces: the strong, weak, and electromagnetic interactions. Of
these, the strong interaction is responsible for nuclear binding and the
interactions of the constituents of nuclei, while the weak interaction is responsible
for radioactive beta decay processes. The electromagnetic interaction is the
familiar Quantum Electrodynamics, coupled minimally to all charged quarks
and leptons. It is not clear that these three forces suffice to explain the most
subtle properties of the elementary fermions (we will discuss this question
in Chapter 20), but these three forces are certainly the most prominent. All
three are now understood to be mediated by the exchange of vector bosons.
The equations describing the electromagnetic interaction were discovered by
Maxwell, and their quantum mechanical implications have been treated in
detail in Part I. The correct theories of the weak and strong interactions were
discovered much later.

By the late 1950s, studies of the helicity dependence of weak interaction
cross sections and decay rates had shown that the weak interaction involves
a coupling of vector currents built of quark and lepton fields.* It was thus
natural to assume that the weak interaction is due to the exchange of very
heavy vector bosons, and indeed, such bosons, the $W$ and $Z$ particles, were
discovered in experiments at CERN in 1983. But a complete theory of the
weak interaction must include not only the correct couplings of the bosons
to fermions, but also the equations of motion of the boson fields themselves,
the analogue, for the $W$ and $Z$, of Maxwell's equations. Finding the correct
form of these equations was not straightforward, because Maxwell's equations
prohibit the generation of a mass for the vector particle. The proper
reconciliation of the generalized Maxwell equations with the nonzero $W$ and $Z$ masses
turned out to require incorporating into the theory a spontaneously broken
symmetry. Chapters 20 and 21 treat this subject in some detail, describing
the interplay of vector field theories with spontaneously broken symmetry.
This interplay leads to new twists and new phenomena, beyond those
discussed in our treatment of spontaneous symmetry breaking in Chapter 11.
A complete theory of the weak interaction also requires the simultaneous
incorporation of the electromagnetic interaction, forming a unified structure as
first hypothesized by Glashow, Weinberg, and Salam.

On the other hand, it was for a long time completely obscure that a theory 
of exchanged vector bosons could correctly describe the strong interaction. 
Part of the mystery was that quarks do not exist as isolated species. Their 
existence, and eventually their quantum numbers, had to be deduced from the 
spectrum of observable strongly interacting particles. But, in addition, there 
were complications due to the fact that the strong interactions are strong. 
The Feynman diagram expansion assumes that the coupling constant is small; 
when the coupling becomes strong, a large number of diagrams are important 
(if the series converges at all) and it becomes impossible to pick out the 
contributions of the elementary interaction vertices. The crucial clue that the 
strong interactions have a vector character arose from what at first seemed 
to be just another mystery, the observation that the strong interactions turn 
themselves off when the momentum transfer is large, in a sense that we will 
now describe. 

Almost Free Partons 

In Section 5.1 we computed the cross section for the QED process
$e^+e^- \to \mu^+\mu^-$. We then remarked that the corresponding cross section for
$e^+e^-$ annihilation into hadrons could be computed in the same way, using a
simplistic model in which the quarks are treated as noninteracting fermions. This
method gives a surprisingly accurate formula for the cross section, capturing its
most important qualitative features. But we deferred the explanation of this
puzzle: How can a model of noninteracting quarks represent the behavior of a
force that, under other circumstances, is extremely strong?

*For an overview of weak interaction phenomenology, see Perkins (1987),
Chapter 7, or any other modern particle physics text.

In fact, there are many circumstances in the study of the strong interaction
at high energy in which this force has an unexpectedly weak effect. Historically, 
the first of these appeared in proton-proton collisions. At high energy, above 
10 GeV or so in the center of mass, collisions of protons (or any other hadrons) 
produce large numbers of pions. One might have imagined that these pions 
would fill all of the allowed phase space, but, in fact, they are mainly produced 
with momenta almost collinear with the collision axis. The probability of 
producing a pion with a large component of momentum transverse to the 
collision axis falls off exponentially in the value of this transverse momentum, 
suppressing the production substantially for transverse momenta greater than 
a few hundred MeV. 

This phenomenon of limited transverse momentum led to a picture of a 
hadron as a loosely bound assemblage of many components. In this picture, a 
proton struck by another proton would be torn into a cloud of pieces. These 
pieces would have momenta roughly collinear with the original momentum 
of the proton and would eventually reform into hadrons moving along the 
collision axis. By hypothesis, these pieces could not absorb a large momentum 
transfer. We can characterize this hypothesis mathematically as follows: In 
a high-energy collision, the momenta of the two initial hadrons are almost 
lightlike. The shattered pieces of the hadrons, arrayed along the collision axis, 
also have lightlike momenta parallel to the original momentum vectors. This 
final state can be produced by exchanging momenta q among the pieces in 
such a way that, though the components of q might be large, the invariant 
q 2 is always small. The ejection of a hadron at large transverse momentum 
would require large (spacelike) q 2 , but such a process was very rare. Thus it 
was hypothesized that hadrons were loose clouds of constituents, like jelly, 
which could not absorb a large q 2 . 

This picture of hadronic structure was put to a crucial test in the late
1960s, in the SLAC-MIT deep inelastic scattering experiments.* In these
experiments, a 20 GeV electron beam was scattered from a hydrogen target, and
the scattering rate was measured for large deflection angles, corresponding to
large invariant momentum transfers from the electron to a proton in the
target. The large momentum transfer was delivered through the electromagnetic
rather than the strong interaction, so that the amount of momentum delivered
could be computed from the momentum of the scattered electron. In models
in which hadrons were complex and softly bound, very low scattering rates
were expected.

Instead, the SLAC-MIT experiments saw a substantial rate for hard
scattering of electrons from protons. The total reaction rate was comparable to
what would have been expected if the proton were an elementary particle
scattering according to the simplest expectations from QED. However, only in rare
cases did a single proton emerge from the scattering process. The largest part
of the rate came from the deep inelastic region of phase space, in which the
electromagnetic impulse shattered the proton and produced a system with a
large number of hadrons.

*For a description of these experiments and their ramifications, see J. I. Friedman,
H. W. Kendall, and R. E. Taylor, Rev. Mod. Phys. 63, 573 (1991).

How could one reconcile the presence of electromagnetic hard scattering
processes with the virtual absence of hard scattering in strong interaction
processes? To answer this question, Bjorken and Feynman advanced the following
simple model, called the parton model. Assume that the proton is a loosely
bound assemblage of a small number of constituents, called partons. These
include quarks (and antiquarks), which are fermions carrying electric charge,
and possibly other neutral species responsible for their binding. By
assumption, these constituents are incapable of exchanging large momenta $q^2$ through
the strong interactions. However, the quarks have the electromagnetic
interactions of elementary fermions, so that an electron scattering from a quark
can knock it out of the proton. The struck quark then exchanges momentum
softly with the remainder of the proton, so that the pieces of the proton
materialize as a jet of hadrons. The produced hadrons should be collinear with
the direction of the original struck parton.

The parton model, incomplete though it is, imposes a strong constraint
on the cross section for deep inelastic electron scattering. To derive this
constraint, consider first the cross section for electron scattering from a single
constituent quark. We discussed the related process of electron-muon
scattering in Section 5.4, and we can borrow that result. Since we imagine the
reaction to occur at very high energy, we will ignore all masses. The square of
the invariant matrix element in the massless limit is written in a simple form
in Eq. (5.71):

$$\frac{1}{4}\sum_{\rm spins}|\mathcal{M}|^2 = \frac{8e^4Q_i^2}{\hat t^2}\left(\frac{\hat s^2 + \hat u^2}{4}\right), \tag{14.1}$$


where $\hat s$, $\hat t$, $\hat u$ are the Mandelstam variables for the electron-quark collision and
$Q_i$ is the electric charge of the quark in units of $|e|$. Recall from Eq. (5.73) that,
for a collision involving massless particles, $\hat s + \hat t + \hat u = 0$. Then the differential
cross section in the center of mass system is

$$\frac{d\sigma}{d\cos\theta_{CM}} = \frac{1}{2\hat s}\,\frac{1}{16\pi}\,\frac{8e^4Q_i^2}{\hat t^2}\left(\frac{\hat s^2 + \hat u^2}{4}\right)
= \frac{\pi\alpha^2Q_i^2}{\hat s}\left(\frac{\hat s^2 + (\hat s + \hat t)^2}{\hat t^2}\right). \tag{14.2}$$

Or, since $\hat t = -\hat s(1 - \cos\theta_{CM})/2$,

$$\frac{d\sigma}{d\hat t} = \frac{2\pi\alpha^2Q_i^2}{\hat s^2}\left(\frac{\hat s^2 + (\hat s + \hat t)^2}{\hat t^2}\right). \tag{14.3}$$

To make use of this result, we must relate the invariants $\hat s$ and $\hat t$ to
experimental observables of electron-proton inelastic scattering. The kinematic
variables are shown in Fig. 14.1. The momentum transfer $q$ from the electron
can be measured by measuring the final momentum and energy of the
electron, without using any information from the hadronic products. Since $q^\mu$ is
a spacelike vector, one conventionally expresses its invariant square in terms
of a positive quantity $Q$, with

$$Q^2 \equiv -q^2. \tag{14.4}$$

Then the invariant $\hat t$ is simply $-Q^2$.

Figure 14.1. Kinematics of deep inelastic electron scattering in the parton
model.

Expressing $\hat s$ in terms of measurable quantities is more difficult. If the
collision is viewed from the electron-proton center of mass frame, and we
visualize the proton as a loosely bound collection of partons (and continue
to ignore masses), we can characterize a given parton by the fraction of the
proton's total momentum that it carries. We denote this longitudinal fraction
by the parameter $\xi$, with $0 < \xi < 1$. For each species $i$ of parton, for example,
up-type quarks with electric charge $Q_i = +2/3$, there will be a function $f_i(\xi)$
that expresses the probability that the proton contains a parton of type $i$ and
longitudinal fraction $\xi$. The expression for the total cross section for
electron-proton inelastic scattering will contain an integral over the value of $\xi$ for the
struck parton. The momentum vector of the parton is then $p = \xi P$, where
$P$ is the total momentum of the proton. Thus, if $k$ is the initial electron
momentum,

$$\hat s = (p + k)^2 = 2p\cdot k = 2\xi P\cdot k = \xi s, \tag{14.5}$$

where $s$ is the square of the electron-proton center of mass energy.

Remarkably, $\xi$ can also be determined from measurements of only the
electron momentum, if one makes the assumption that the electron-parton
scattering is elastic. Since the scattered parton has a mass small compared to
$s$ and $Q^2$,

$$
0 \approx (p + q)^2 = 2p\cdot q + q^2 = 2\xi P\cdot q - Q^2.
\tag{14.6}
$$

Thus

$$
\xi = x, \qquad \text{where } x \equiv \frac{Q^2}{2P\cdot q}.
\tag{14.7}
$$
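
To make the statement that $x$ and $Q^2$ follow from the electron alone more
concrete, here is a small Python sketch. It assumes a fixed-target setup
(proton at rest, so $P\cdot q = m(E - E')$) and a massless electron, for which
$Q^2 = 4EE'\sin^2(\theta/2)$. These lab-frame formulas, the function names,
and the sample beam values are assumptions for illustration and do not appear
in the text above.

```python
import math

M_PROTON = 0.938272  # proton mass in GeV (assumed value)


def dis_kinematics(e_beam, e_scattered, theta, m=M_PROTON):
    """Return (Q2, q0, x) from the electron energies (GeV) and lab angle (rad)."""
    q2 = 4.0 * e_beam * e_scattered * math.sin(theta / 2.0) ** 2  # Q^2 = -q^2
    q0 = e_beam - e_scattered                                     # P.q / m
    x = q2 / (2.0 * m * q0)                                       # Eq. (14.7)
    return q2, q0, x


def q2_from_four_vectors(e_beam, e_scattered, theta):
    """Cross-check: build q = k - k' explicitly and return -q^2."""
    k = (e_beam, 0.0, 0.0, e_beam)
    kp = (e_scattered,
          e_scattered * math.sin(theta), 0.0,
          e_scattered * math.cos(theta))
    q = [a - b for a, b in zip(k, kp)]
    return -(q[0] ** 2 - q[1] ** 2 - q[2] ** 2 - q[3] ** 2)


# Hypothetical event: 10 GeV beam, 4 GeV scattered electron at 20 degrees.
q2, q0, x = dis_kinematics(10.0, 4.0, math.radians(20.0))
print(f"Q^2 = {q2:.3f} GeV^2, q0 = {q0:.3f} GeV, x = {x:.3f}")
```

No hadronic information enters: the three arguments are all properties of the
incoming and outgoing electron.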


From each scattered electron, one can determine the values of $Q^2$ and $x$
for the scattering process. The parton model then predicts the event
distribution in the $x$-$Q^2$ plane. Using the parton distribution functions
$f_i(\xi)$, evaluated at $\xi = x$, and the cross-section formula (14.3), we
find the distribution

$$
\frac{d^2\sigma}{dx\,dQ^2}
  = \sum_i f_i(x)\, Q_i^2 \cdot \frac{2\pi\alpha^2}{Q^4}
    \left[1 + \left(1 - \frac{Q^2}{xs}\right)^2\right].
\tag{14.8}
$$

Figure 14.2. Test of Bjorken scaling using the $e^-p$ deep inelastic scattering
cross sections measured by the SLAC-MIT experiment, J. S. Poucher, et al.,
Phys. Rev. Lett. 32, 118 (1974). We plot $d^2\sigma/dx\,dQ^2$ divided by the
factor (14.9) against $x$, for the various initial electron energies and
scattering angles indicated. The data span the range
1 GeV$^2$ < $Q^2$ < 8 GeV$^2$.


The distribution functions $f_i(x)$ depend on the details of the structure of
the proton and it is not known how to compute them from first principles.
But formula (14.8) still makes a striking prediction, that the deep inelastic
scattering cross section, when divided by the factor

$$
\frac{1 + (1 - Q^2/xs)^2}{Q^4}
\tag{14.9}
$$

to remove the kinematic dependence of the QED cross section, gives a quantity
that depends only on $x$ and is independent of $Q^2$. This behavior is known as
Bjorken scaling. Indeed, the data from the SLAC-MIT experiment exhibited
Bjorken scaling to about 10% accuracy for values of $Q$ above 1 GeV, as shown
in Fig. 14.2.
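
The scaling statement can be made explicit with a short Python sketch that
evaluates formula (14.8) and divides out the factor (14.9). The parton
distributions below are hypothetical toy functions chosen only so that the code
runs; they are not fits to data, and the function names and the value of $s$
are likewise assumptions. In the naive parton model the $Q^2$ independence is
built into (14.8), so the sketch shows what is being divided out rather than
testing the model.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (assumed low-energy value)


def f_up(x):
    """Hypothetical toy up-quark distribution (illustration only)."""
    return 2.0 * math.sqrt(x) * (1.0 - x) ** 3


def f_down(x):
    """Hypothetical toy down-quark distribution (illustration only)."""
    return 1.0 * math.sqrt(x) * (1.0 - x) ** 4


TOY_PARTONS = [(2.0 / 3.0, f_up), (-1.0 / 3.0, f_down)]  # (charge Q_i, f_i)


def kinematic_factor(x, q2, s):
    """The factor (14.9): [1 + (1 - Q^2/(x s))^2] / Q^4."""
    return (1.0 + (1.0 - q2 / (x * s)) ** 2) / q2**2


def d2sigma_dx_dq2(x, q2, s, partons=TOY_PARTONS, alpha=ALPHA):
    """Eq. (14.8) in the naive parton model."""
    charge_sum = sum(q_i**2 * f(x) for q_i, f in partons)
    return charge_sum * 2.0 * math.pi * alpha**2 * kinematic_factor(x, q2, s)


def scaled_cross_section(x, q2, s):
    """Cross section divided by (14.9); should depend on x only."""
    return d2sigma_dx_dq2(x, q2, s) / kinematic_factor(x, q2, s)


S = 40.0  # electron-proton CM energy squared in GeV^2 (hypothetical)
scaled = [scaled_cross_section(0.3, q2, S) for q2 in (1.0, 2.0, 4.0, 8.0)]
raw = [d2sigma_dx_dq2(0.3, q2, S) for q2 in (1.0, 2.0, 4.0, 8.0)]
print(scaled)
print(raw)
```

The raw cross section falls steeply with $Q^2$, while the scaled quantity is
the same at every $Q^2$ for fixed $x$.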

Bjorken scaling is, essentially, the statement that the structure of the
proton looks the same to an electromagnetic probe no matter how hard the
proton is struck. In the frame of the proton, the energy of the exchanged
virtual photon is

$$
q^0 = \frac{P\cdot q}{m} = \frac{Q^2}{2xm},
\tag{14.10}
$$


where $m$ is the proton mass. The reciprocal of this energy transfer is,
roughly, the duration of the scattering process as seen by the components of
the proton. This time should be compared to the reciprocal of the proton mass,
which is the characteristic time over which the partons interact. The deep
inelastic regime occurs when $q^0 \gg m$, that is, when the scattering is very
rapid compared to the normal time scales of the proton. Bjorken scaling implies
that, during such a rapid scattering process, interactions among the
constituents of the proton can be ignored. We might imagine that the partons
are approximately free particles over the very short time scales corresponding
to energy transfers of a GeV or more, though they have strong interactions on
longer time scales.
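
For a rough sense of the numbers, the following Python sketch evaluates
Eq. (14.10) and converts energies to times using $\hbar$. The sample values of
$Q^2$ and $x$, the constants, and the function names are illustrative
assumptions.

```python
HBAR_GEV_S = 6.582119569e-25  # hbar in GeV * s (assumed value)
M_PROTON = 0.938272           # proton mass in GeV (assumed value)


def photon_energy(q2, x, m=M_PROTON):
    """Eq. (14.10): q^0 = Q^2 / (2 x m) in the proton rest frame."""
    return q2 / (2.0 * x * m)


def time_scale_seconds(energy_gev):
    """Time scale hbar / E associated with an energy E."""
    return HBAR_GEV_S / energy_gev


q0 = photon_energy(q2=4.0, x=0.3)         # hypothetical event, Q^2 in GeV^2
t_scatter = time_scale_seconds(q0)        # duration of the hard scattering
t_proton = time_scale_seconds(M_PROTON)   # characteristic parton interaction time
print(q0, t_scatter, t_proton)
```

For this sample point $q^0$ is several times the proton mass, so the scattering
time is correspondingly shorter than the proton's internal time scale.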


Asymptotically Free Partons 

The picture of the proton structure implied by Bjorken scaling was beautifully 
simple, but it raised new, fundamental questions. In quantum field theory, 
fermions interact by exchanging virtual particles. These virtual particles can 
have arbitrarily high momenta, hence the fluctuations associated with them 
can occur on arbitrarily short time scales. Quantum field theory processes do 
not turn themselves off at short times to reveal free-particle equations. Thus 
the discovery of Bjorken scaling suggested a conflict between the observation 
of almost free partons and the basic principles of quantum field theory. 

The resolution of this paradox came from the renormalization group. In
Chapter 12 we saw that coupling constants vary with distance scale. In QED
and $\phi^4$ theory, we found that the couplings become strong at large momenta
and weak at small momenta. However, we noted the possibility that, in some
theories, the coupling constant could have the opposite behavior, becoming
strong at small momenta or large times but weak at large momenta or short
times. We referred to such behavior as asymptotic freedom. Section 13.3
discussed an example of an asymptotically free quantum field theory, the
nonlinear sigma model in two dimensions. The problem posed in the previous
paragraph would be resolved if there existed a suitable asymptotically free
quantum field theory in four dimensions that could describe the interaction and
binding of quarks. Then, at least to some level of approximation, the strong
interaction described by this theory would turn off in large-momentum-transfer
or short-time processes.

At the time of the discovery of Bjorken scaling, no asymptotically free field
theories in four dimensions were known. Then, in the early 1970s, 't Hooft,
Politzer, Gross, and Wilczek discovered a class of such theories. These are
the non-Abelian gauge theories: theories of interacting vector bosons that
can be constructed as generalizations of quantum electrodynamics. It was
subsequently shown that these are the only asymptotically free field theories
in four dimensions. This discovery gave the crucial clue for the construction
of the fundamental theory of the strong interactions. Apparently, the quarks
are bound together by interacting vector bosons (called gluons) of precisely
this type.

However, these gauge theories cannot precisely reproduce the expectations
of strict Bjorken scaling. The differences between the free parton model
and the quantum field theory model with asymptotic freedom appear when
one moves to a higher level of accuracy in measurements of deep inelastic
scattering and other strong interaction processes involving large momentum
transfer. In an asymptotically free quantum field theory, the coupling
constant is still nonzero at any finite momentum transfer. In fact, the final
evolution of the coupling to zero is very slow, logarithmic in momentum. Thus,
at some level, one must find small corrections to Bjorken scaling, associated
with the exchange or emission of high-momentum gluons. Similarly, the other
qualitative simplifications of hadron physics at high momentum transfer (for
example, the phenomenon of limited transverse momentum in hadron-hadron
collisions) should be only approximate, receiving corrections due to gluon
exchange and emission. Thus the predictions of an asymptotically free theory of
the strong interaction are twofold. On one hand, such a theory predicts
qualitative simplifications of behavior at high momentum. But, on the other
hand, such a theory predicts a specific pattern of corrections to this behavior.

In fact, particle physics experiments of the 1970s revealed precisely this
picture. Bjorken scaling was found to be only an approximate relation,
showing violations that correspond to a slow evolution of the parton
distributions $f_i(x)$ over a logarithmic scale in $Q^2$. The rate of particle
production in hadron-hadron collisions was found to decrease only as a power
rather than exponentially at very large values of the transverse momentum, and
the particles produced at large transverse momentum were shown to be associated
with jets of hadrons created by the soft evolution of a hard-scattered quark
or gluon. Most remarkably, the forms of the cross sections found for these and
other deviations from scaling did, finally, give direct evidence for the vector
character of the elementary field that mediates the strong interaction.

We will review all of these phenomena in Chapter 17, as we study the 
particular gauge theory that describes the strong interactions. First, however, 
we must learn how to construct non-Abelian gauge theories and how to work 
out their predictions using Feynman diagrams. Throughout our analysis of 
these theories, the renormalization group will play an essential role. One of 
the very beautiful aspects of the study of non-Abelian gauge theories is the way 
in which the most powerful general ideas of quantum field theory acquire even 
more strength as they intertwine with the specific features of these particular, 
intricately built models. This interplay between general principles and the 
specific features of gauge theories will be the major theme of Part III of this 
book.


Non-Abelian Gauge Invariance


So far in this book we have worked with a rather limited class of quantum fields
and interactions, restricting our attention to scalar field theories, Yukawa
theory, and Quantum Electrodynamics. It is hardly surprising that these theories
are not sufficient to describe all of the known interactions of elementary
particles. But what other theories are possible, given that the Lagrangian of a
renormalizable theory can contain no terms of mass dimension higher than 4?

The most natural theories to try next would be ones with interactions
among vector fields, of the form $A^\mu A^\nu \partial_\mu A_\nu$ or $A^4$.
Sensible theories of this type are difficult to construct, however, because of
the negative-norm states produced by the time component $A^0$ of the vector
field operator. In Section 5.5 we saw that these negative-norm states cause no
difficulty in QED: They are effectively canceled out by the longitudinal
polarization states, by virtue of the Ward identity. The Ward identity, in
turn, follows from the invariance of the QED Lagrangian under local gauge
transformations. Perhaps, then, if we can generalize the principle of local
gauge invariance, it will lead us to the construction of other sensible
theories of vector particles.

The goal of this chapter is to do just that. First we will return briefly to
the study of QED, this time taking the gauge symmetry to be fundamental
and deriving the rest of the theory from this principle. Then, in Section 15.2,
we will see that the gauge invariance of electrodynamics is only the most
trivial example of an infinite-parameter symmetry, and that the more general
examples lead to other interesting Lagrangians. These field theories, the
first of which was constructed by Yang and Mills,* generalize electrodynamics
in a profound way. They are theories of multiple vector particles, whose
interactions are strongly constrained by the symmetry principle. In subsequent
chapters we will study the quantization of these theories and their application
to the real world of elementary particle physics.


*C. N. Yang and R. Mills, Phys. Rev. 96, 191 (1954).

15.1 The Geometry of Gauge Invariance

In Section 4.1 we wrote down the Lagrangian of Quantum Electrodynamics
and noted the curious fact that it is invariant under a very large group of
transformations (4.6), allowing an independent symmetry transformation at
every point in spacetime. This invariance is the famous gauge symmetry of
QED. From the modern viewpoint, however, gauge symmetry is not an incidental
curiosity, but rather the fundamental principle that determines the
form of the Lagrangian. Let us now review the elements of the theory, taking
the modern viewpoint.

We begin with the complex-valued Dirac field $\psi(x)$, and stipulate that
our theory should be invariant under the transformation

$$
\psi(x) \to e^{i\alpha(x)}\psi(x).
\tag{15.1}
$$

This is a phase rotation through an angle $\alpha(x)$ that varies arbitrarily
from point to point. How can we write a Lagrangian that is invariant under this
transformation? As long as we consider terms in the Lagrangian that have no
derivatives, this is easy: We simply write the same terms that are invariant to
global phase rotations. For example, the fermion mass term

$$
m\bar\psi\psi(x)
$$

is permitted by global phase invariance, and the local invariance gives no
further restriction.

The difficulty arises when we try to write terms including derivatives. The
derivative of $\psi(x)$ in the direction of the vector $n^\mu$ is defined by the
limiting procedure

$$
n^\mu \partial_\mu \psi
  = \lim_{\epsilon\to 0} \frac{1}{\epsilon}
    \bigl[\psi(x + \epsilon n) - \psi(x)\bigr].
\tag{15.2}
$$

However, in a theory with local phase invariance, this definition is not very
sensible, since the two fields that are subtracted, $\psi(x + \epsilon n)$ and
$\psi(x)$, have completely different transformations under the symmetry (15.1).
The quantity $\partial_\mu\psi$, in other words, has no simple transformation
law and no useful geometric interpretation.

In order to subtract the values of $\psi(x)$ at neighboring points in a
meaningful way, we must introduce a factor that compensates for the difference
in phase transformations from one point to the next. The simplest way to do
this is to define a scalar quantity $U(y,x)$ that depends on the two points and
has the transformation law

$$
U(y,x) \to e^{i\alpha(y)}\, U(y,x)\, e^{-i\alpha(x)}
\tag{15.3}
$$

simultaneously with (15.1). At zero separation, we set $U(y,y) = 1$; in
general, we can require $U(y,x)$ to be a pure phase:
$U(y,x) = \exp[i\phi(y,x)]$. With this definition, the objects $\psi(y)$ and
$U(y,x)\psi(x)$ have the same transformation law, and we can subtract them in a
manner that is meaningful despite the local symmetry. Thus we can define a
sensible derivative, called the covariant derivative, as follows:

$$
n^\mu D_\mu \psi
  = \lim_{\epsilon\to 0} \frac{1}{\epsilon}
    \bigl[\psi(x + \epsilon n) - U(x + \epsilon n, x)\,\psi(x)\bigr].
\tag{15.4}
$$

To make this definition explicit, we need an expression for the comparator
$U(y,x)$ at infinitesimally separated points. If the phase of $U(y,x)$ is a
continuous function of the positions $y$ and $x$, then $U(y,x)$ can be expanded
in the separation of the two points:

$$
U(x + \epsilon n, x) = 1 - ie\,\epsilon n^\mu A_\mu(x) + O(\epsilon^2).
\tag{15.5}
$$

Here we have arbitrarily extracted a constant $e$. The coefficient of the
displacement $\epsilon n^\mu$ is a new vector field $A_\mu(x)$. Such a field,
which appears as the infinitesimal limit of a comparator of local symmetry
transformations, is called a connection. The covariant derivative then takes
the form

$$
D_\mu \psi(x) = \partial_\mu \psi(x) + ieA_\mu \psi(x).
\tag{15.6}
$$

By inserting (15.5) into (15.3), one finds that $A_\mu$ transforms under this
local gauge transformation as

$$
A_\mu(x) \to A_\mu(x) - \frac{1}{e}\,\partial_\mu \alpha(x).
\tag{15.7}
$$

To check that all of these expressions are consistent, we can transform
$D_\mu\psi(x)$ according to Eqs. (15.1) and (15.7):

$$
\begin{aligned}
D_\mu \psi(x) &\to \Bigl[\partial_\mu
   + ie\Bigl(A_\mu - \frac{1}{e}\,\partial_\mu\alpha\Bigr)\Bigr]
   e^{i\alpha(x)}\psi(x) \\
 &= e^{i\alpha(x)}\bigl(\partial_\mu + ieA_\mu\bigr)\psi(x)
  = e^{i\alpha(x)} D_\mu\psi(x).
\end{aligned}
\tag{15.8}
$$

Thus the covariant derivative transforms in the same way as the field $\psi$,
exactly as we constructed it to in the original definition (15.4).
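
The role of the comparator can also be seen at finite separation. The Python
sketch below, a minimal illustration on a one-dimensional line, compares the
naive difference of (15.2) with the compensated difference of (15.4) under a
local phase rotation. The sample functions `psi`, `a_field`, and `alpha`, the
coupling `E_CHARGE`, and the spacing `EPS` are arbitrary hypothetical choices.
At finite `EPS` the compensated difference picks up the phase at the endpoint
`x + EPS`, which reduces to the phase at `x` as `EPS` goes to zero.

```python
import cmath
import math

E_CHARGE = 0.3  # illustrative value of the constant e
EPS = 0.1       # finite separation epsilon


def psi(x):
    """Arbitrary sample field configuration."""
    return complex(math.cos(x), 0.5 * math.sin(2.0 * x))


def a_field(x):
    """Arbitrary sample connection component along the line."""
    return 0.7 * math.sin(x) + 0.2


def alpha(x):
    """Arbitrary local phase angle alpha(x)."""
    return 1.3 * math.sin(3.0 * x) + x**2


def comparator(x):
    """U(x + eps, x) = exp(-i e eps A), with A sampled at the midpoint."""
    return cmath.exp(-1j * E_CHARGE * EPS * a_field(x + EPS / 2.0))


def psi_t(x):
    """Transformed field, Eq. (15.1)."""
    return cmath.exp(1j * alpha(x)) * psi(x)


def comparator_t(x):
    """Transformed comparator, Eq. (15.3) with y = x + eps."""
    return (cmath.exp(1j * alpha(x + EPS)) * comparator(x)
            * cmath.exp(-1j * alpha(x)))


def cov_diff(x, field, link):
    """Finite-eps version of Eq. (15.4)."""
    return (field(x + EPS) - link(x) * field(x)) / EPS


def naive_diff(x, field):
    """Finite-eps version of Eq. (15.2)."""
    return (field(x + EPS) - field(x)) / EPS


x0 = 0.4
phase = cmath.exp(1j * alpha(x0 + EPS))
before, after = cov_diff(x0, psi, comparator), cov_diff(x0, psi_t, comparator_t)
naive_before, naive_after = naive_diff(x0, psi), naive_diff(x0, psi_t)
print(abs(after - phase * before))              # covariant: rounding error only
print(abs(naive_after - phase * naive_before))  # naive: not covariant
```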

We have now recovered most of the familiar ingredients of the QED Lagrangian.
From our current viewpoint, however, the definition of the covariant
derivative and the transformation law for the connection $A_\mu$ follow from the
postulate of local phase rotation symmetry. Even the very existence of the
vector field $A_\mu$ is a consequence of local symmetry: Without it we could not
write an invariant Lagrangian involving derivatives of $\psi$.

More generally, our present analysis gives us a way to construct all possible
Lagrangians that are invariant under the local symmetry. In any term
with derivatives of $\psi$, replace these with covariant derivatives. According
to Eq. (15.8), these transform in exactly the same manner as $\psi$ itself.
Therefore any combination of $\psi$ and its covariant derivatives that is
invariant under a global phase rotation (and only these combinations) will also
be locally invariant.

To complete the construction of a locally invariant Lagrangian, we must
find a kinetic energy term for the field $A_\mu$: a locally invariant term that
depends on $A_\mu$ and its derivatives, but not on $\psi$. This term can be
constructed either integrally, from the comparator $U(y,x)$, or infinitesimally,
from the covariant derivative.

Figure 15.1. Construction of the field strength by comparisons around a
small square in the (1,2) plane.

Working from $U(y,x)$, we will need to extend our explicit formula (15.5)
to the next term in the expansion in $\epsilon$. Using the assumption that
$U(y,x)$ is a pure phase and the restriction $(U(x,y))^\dagger = U(y,x)$, it
follows that

$$
U(x + \epsilon n, x)
  = \exp\bigl[-ie\,\epsilon n^\mu A_\mu(x + \tfrac{\epsilon}{2} n)
    + O(\epsilon^3)\bigr].
\tag{15.9}
$$

(Relaxing these restrictions introduces additional vector fields into the
theory; this is an unnecessary complication.) Using this expansion for
$U(y,x)$, we link together comparisons of the phase direction around a small
square in spacetime. For definiteness, we take this square to lie in the
(1,2)-plane, as defined by the unit vectors $\hat{1}$, $\hat{2}$ (see
Fig. 15.1). Define $\mathbf{U}(x)$ to be the product of the four comparisons
around the corners of the loop:

$$
\begin{aligned}
\mathbf{U}(x) \equiv{}& U(x, x + \epsilon\hat{2})\,
   U(x + \epsilon\hat{2}, x + \epsilon\hat{1} + \epsilon\hat{2}) \\
 &\times U(x + \epsilon\hat{1} + \epsilon\hat{2}, x + \epsilon\hat{1})\,
   U(x + \epsilon\hat{1}, x).
\end{aligned}
\tag{15.10}
$$

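The transformation law (15.3) for $U$ implies that $\mathbf{U}(x)$ is locally
invariant. In the limit $\epsilon \to 0$, it will therefore give us a locally
invariant function of $A_\mu$. To find the form of this function, insert the
expansion (15.9) to obtain

$$
\begin{aligned}
\mathbf{U}(x) = \exp\Bigl\{ -i\epsilon e\bigl[
   &-A_2(x + \tfrac{\epsilon}{2}\hat{2})
    - A_1(x + \tfrac{\epsilon}{2}\hat{1} + \epsilon\hat{2}) \\
   &+ A_2(x + \epsilon\hat{1} + \tfrac{\epsilon}{2}\hat{2})
    + A_1(x + \tfrac{\epsilon}{2}\hat{1}) \bigr]
   + O(\epsilon^3) \Bigr\}.
\end{aligned}
\tag{15.11}
$$

When we expand the exponent in powers of $\epsilon$, this reduces to

$$
\mathbf{U}(x) = 1 - i\epsilon^2 e\bigl[\partial_1 A_2(x) - \partial_2 A_1(x)\bigr]
  + O(\epsilon^3).
\tag{15.12}
$$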
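
The loop construction can be tried out numerically. The following Python
sketch builds the four comparators of (15.10) from the midpoint form (15.9)
for a sample two-dimensional field, compares the product with the prediction
(15.12), and checks that the product is unchanged when every comparator is
transformed according to (15.3). The field components `a1`, `a2`, the gauge
function `alpha`, the coupling, and the base point are hypothetical choices
made only for illustration.

```python
import cmath
import math

E_CHARGE = 0.5  # illustrative value of the constant e


def a1(x, y):
    """Sample component A_1(x, y)."""
    return x * math.sin(y)


def a2(x, y):
    """Sample component A_2(x, y)."""
    return math.cos(x) + 0.3 * x * y


def f12(x, y):
    """Analytic d_1 A_2 - d_2 A_1 for the sample field above."""
    return (-math.sin(x) + 0.3 * y) - x * math.cos(y)


def alpha(x, y):
    """Arbitrary local phase angle."""
    return math.sin(2.0 * x) * math.cos(y)


def link(start, end):
    """Comparator U(end, start) for one short straight step, as in (15.9)."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    mx, my = (start[0] + end[0]) / 2.0, (start[1] + end[1]) / 2.0
    return cmath.exp(-1j * E_CHARGE * (dx * a1(mx, my) + dy * a2(mx, my)))


def gauge_link(start, end):
    """The same comparator after the local transformation (15.3)."""
    return (cmath.exp(1j * alpha(*end)) * link(start, end)
            * cmath.exp(-1j * alpha(*start)))


def plaquette(x, y, eps, step=link):
    """Product of the four comparators around the square, Eq. (15.10)."""
    p0, p1 = (x, y), (x + eps, y)
    p2, p3 = (x + eps, y + eps), (x, y + eps)
    return step(p3, p0) * step(p2, p3) * step(p1, p2) * step(p0, p1)


x0, y0, eps = 0.7, 0.2, 1e-3
loop = plaquette(x0, y0, eps)
prediction = 1.0 - 1j * eps**2 * E_CHARGE * f12(x0, y0)   # Eq. (15.12)
loop_transformed = plaquette(x0, y0, eps, step=gauge_link)
print(abs(loop - prediction) / eps**2)   # small: the mismatch is O(eps^3)
print(abs(loop_transformed - loop))      # rounding error only
```

The endpoint phases cancel pairwise around the closed loop, which is why the
transformed product agrees with the original one to rounding error.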

Therefore the structure

$$
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu
\tag{15.13}
$$

is locally invariant. Of course, $F_{\mu\nu}$ is the familiar electromagnetic
field tensor, and its invariance under (15.7) can be checked directly. The
preceding construction, however, shows us the geometrical origin of the
structure of $F_{\mu\nu}$. Any function that depends on $A_\mu$ only through
$F_{\mu\nu}$ and its derivatives is locally invariant. More general functions,
such as the vector field mass term $A^\mu A_\mu$, transform under (15.7) in ways
that cannot be compensated and thus cannot appear in an invariant Lagrangian.

A related argument for the invariance of $F_{\mu\nu}$ can be made using the
covariant derivative. We have seen above that, if a field has the local
transformation law (15.1), then its covariant derivative has the same
transformation law. Thus the second covariant derivative of $\psi$ also
transforms according to (15.1). The same conclusion holds for the commutator of
covariant derivatives:

$$
[D_\mu, D_\nu]\psi(x) \to e^{i\alpha(x)}\,[D_\mu, D_\nu]\psi(x).
\tag{15.14}
$$

However, the commutator is not itself a derivative at all:

$$
\begin{aligned}
[D_\mu, D_\nu]\psi &= [\partial_\mu, \partial_\nu]\psi
   + ie\bigl([\partial_\mu, A_\nu] - [\partial_\nu, A_\mu]\bigr)\psi
   - e^2 [A_\mu, A_\nu]\psi \\
 &= ie\,(\partial_\mu A_\nu - \partial_\nu A_\mu)\cdot\psi.
\end{aligned}
\tag{15.15}
$$

That is,

$$
[D_\mu, D_\nu] = ieF_{\mu\nu}.
\tag{15.16}
$$

On the right-hand side of (15.14), the factor $\psi(x)$ accounts for the entire
transformation law, so the multiplicative factor $F_{\mu\nu}$ must be invariant.
One can visualize the commutator of covariant derivatives as the comparison of
comparisons across a small square; fundamentally, therefore, this argument is
equivalent to that of the previous paragraph.

Whatever the method of proving the invariance of $F_{\mu\nu}$, we have now
assembled all of the ingredients we need to write the most general locally
invariant Lagrangian for the electron field $\psi$ and its associated
connection $A_\mu$. This Lagrangian must be a function of $\psi$ and its
covariant derivatives, and of $F_{\mu\nu}$ and its derivatives, and must be
invariant to global phase transformations. Up to operators of dimension 4,
there are only four possible terms:

$$
\mathcal{L}_4 = \bar\psi(i\not{\!\!D})\psi - \frac{1}{4}(F_{\mu\nu})^2
  - c\,\epsilon^{\alpha\beta\mu\nu}F_{\alpha\beta}F_{\mu\nu} - m\bar\psi\psi.
\tag{15.17}
$$

By adjusting the normalization of the fields $\psi$ and $A_\mu$, we have set the
coefficients of the first two terms to their standard values. This
normalization of $A_\mu$ requires the arbitrary scale factor $e$ in our original
definition (15.5) of $A_\mu$. The third term violates the discrete symmetries
$P$ and $T$, so we may exclude it if we postulate these symmetries.† Then
$\mathcal{L}_4$ contains only two free parameters, the scale factor $e$ and the
coefficient $m$.

By using operators of dimension 5 and 6, we can form many additional
gauge-invariant combinations:

$$
\mathcal{L}_6 = ic_1\,\bar\psi\sigma_{\mu\nu}F^{\mu\nu}\psi
  + c_2(\bar\psi\psi)^2 + c_3(\bar\psi\gamma^5\psi)^2 + \cdots.
\tag{15.18}
$$

More allowed terms appear at each higher order in mass dimension. But all
of these terms are nonrenormalizable interactions. In the language of
Section 12.1, they are irrelevant to physics in four dimensions in the limit
where the cutoff is taken to infinity.


†The general systematics of P, C, and T violation are discussed in Section 20.3.




We have now reached a remarkable conclusion. We began by postulating
that the electron field obeys the local symmetry (15.1). From this postulate,
we showed that there must be an electromagnetic vector potential. Further,
the symmetry principle implies that the most general Lagrangian in four
dimensions that is renormalizable (or relevant, in Wilson's sense) is the
general form $\mathcal{L}_4$. If we insist that this Lagrangian also be
invariant under time reversal or parity, we are led uniquely to the
Maxwell-Dirac Lagrangian that is the basis of quantum electrodynamics.

15.2 The Yang-Mills Lagrangian 

If the simple geometrical constructions of the previous section yield Maxwell's
theory of electrodynamics, then surely it must be possible to construct other
interesting theories by starting with more general geometrical principles. Yang
and Mills proposed that the argument of the previous section could be
generalized from local phase rotation invariance to invariance under any
continuous symmetry group. In this section, we will introduce this
generalization of local symmetry. For most of the discussion, we will consider
our local symmetry to be the three-dimensional rotation group, $O(3)$ or
$SU(2)$, since in this case the necessary group theory should be familiar. At
the end of the section, we will generalize further to the case of an arbitrary
local symmetry.

Consider, then, the following generalization of the phase rotation (15.1):
Instead of a single fermion field, we start with a doublet of Dirac fields,

$$
\psi = \begin{pmatrix} \psi_1(x) \\ \psi_2(x) \end{pmatrix},
\tag{15.19}
$$

which transform into one another under abstract three-dimensional rotations
as a two-component spinor:

$$
\psi \to \exp\Bigl(i\alpha^i \frac{\sigma^i}{2}\Bigr)\psi.
\tag{15.20}
$$

Here $\sigma^i$ are the Pauli sigma matrices, and, as usual, a sum over repeated
indices is implied. It is important to distinguish this abstract transformation
from a rotation in physical three-dimensional space; in their original paper,
Yang and Mills considered $(\psi_1, \psi_2)$ to be the proton-neutron doublet as
it is transformed under isotopic spin. As in the case of a phase rotation, it is
not hard to construct Lagrangians for $\psi$ that are invariant to (15.20) as a
global symmetry.

We now promote (15.20) to a local symmetry, by insisting that the Lagrangian
be invariant to this transformation for $\alpha^i$ an arbitrary function of $x$.
Write this transformation as

$$
\psi(x) \to V(x)\psi(x), \qquad \text{where }
V(x) = \exp\Bigl(i\alpha^i(x)\frac{\sigma^i}{2}\Bigr).
\tag{15.21}
$$

We can construct a suitable Lagrangian by applying the methods of the previous
section. However, we will encounter a number of additional complications,
due to the fact that there are now three orthogonal symmetry motions, which
do not commute with one another. This feature is sufficiently important to
earn a special name for theories that have it: We refer to the Abelian
symmetry group of electrodynamics, and the non-Abelian symmetry group of the
more general theories. The field theory associated with a noncommuting local
symmetry is termed a non-Abelian gauge theory.

To construct a Lagrangian that is invariant under this new group of
transformations, we must again define a covariant derivative that transforms in
a simple way. Again we use the definition (15.4), but since $\psi$ is now a
two-component object, the comparator $U(y,x)$ must be a 2 x 2 matrix. The
transformation law for $U(y,x)$ is now

$$
U(y,x) \to V(y)\,U(y,x)\,V^\dagger(x),
\tag{15.22}
$$

where $V(x)$ is as in (15.21), and again we set $U(y,y) = 1$. At points
$x \neq y$ we can consistently restrict $U(y,x)$ to be a unitary matrix. Near
$U = 1$, any such matrix can be expanded in terms of the Hermitian generators of
$SU(2)$; thus for infinitesimal separation we can write

$$
U(x + \epsilon n, x) = 1 + ig\,\epsilon n^\mu A^i_\mu \frac{\sigma^i}{2}
  + O(\epsilon^2).
\tag{15.23}
$$

Here $g$ is a constant, extracted for later convenience. Inserting this
expansion into the definition (15.4) of the covariant derivative, we find the
following expression for the covariant derivative associated with local $SU(2)$
symmetry:

$$
D_\mu = \partial_\mu - igA^i_\mu \frac{\sigma^i}{2}.
\tag{15.24}
$$

This covariant derivative requires three vector fields, one for each generator
of the transformation group.

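We can find the gauge transformation law of the connection $A^i_\mu$ by
inserting the expansion (15.23) into the transformation law (15.22):

$$
1 + ig\,\epsilon n^\mu A^i_\mu \frac{\sigma^i}{2}
  \to V(x + \epsilon n)\Bigl(1 + ig\,\epsilon n^\mu A^i_\mu
      \frac{\sigma^i}{2}\Bigr)V^\dagger(x).
\tag{15.25}
$$

We must expand the right-hand side to order $\epsilon$, taking care that the
various Pauli matrices do not commute with one another. The expansion of
$V(x + \epsilon n)$ is conveniently done using the identity

$$
\begin{aligned}
V(x + \epsilon n)V^\dagger(x)
 &= \Bigl[\Bigl(1 + \epsilon n^\mu \frac{\partial}{\partial x^\mu}
      + O(\epsilon^2)\Bigr)V(x)\Bigr]V^\dagger(x) \\
 &= 1 + \epsilon n^\mu\Bigl(\frac{\partial}{\partial x^\mu}V(x)\Bigr)
      V^\dagger(x) + O(\epsilon^2) \\
 &= 1 + \epsilon n^\mu V(x)\Bigl(-\frac{\partial}{\partial x^\mu}
      V^\dagger(x)\Bigr) + O(\epsilon^2).
\end{aligned}
\tag{15.26}
$$

Then the terms proportional to $\epsilon n^\mu$ in (15.25) give the
transformation

$$
A^i_\mu(x)\frac{\sigma^i}{2}
  \to V(x)\Bigl(A^i_\mu(x)\frac{\sigma^i}{2}
      + \frac{i}{g}\,\partial_\mu\Bigr)V^\dagger(x).
\tag{15.27}
$$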

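The derivative acts on $V^\dagger(x) = \exp(-i\alpha^i\sigma^i/2)$; it is not so
easy to compute this derivative explicitly because the exponent does not
necessarily commute with its derivative. For infinitesimal transformations we
can expand $V(x)$ to first order in $\alpha$. In this case we obtain

$$
A^i_\mu \frac{\sigma^i}{2}
  \to A^i_\mu \frac{\sigma^i}{2}
    + \frac{1}{g}(\partial_\mu\alpha^i)\frac{\sigma^i}{2}
    + i\Bigl[\alpha^i\frac{\sigma^i}{2},\, A^j_\mu\frac{\sigma^j}{2}\Bigr]
    + \cdots.
\tag{15.28}
$$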


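The last term in this transformation law is new, and arises from the
noncommutativity of the local transformations. By combining this relation with
the infinitesimal form of the fermion transformation,

$$
\psi \to \Bigl(1 + i\alpha^i\frac{\sigma^i}{2}\Bigr)\psi + \cdots,
\tag{15.29}
$$

we can check the infinitesimal transformation of the covariant derivative:

$$
\begin{aligned}
D_\mu\psi &\to \Bigl(\partial_\mu - igA^i_\mu\frac{\sigma^i}{2}
     - i(\partial_\mu\alpha^i)\frac{\sigma^i}{2}
     + g\Bigl[\alpha^i\frac{\sigma^i}{2},\, A^j_\mu\frac{\sigma^j}{2}\Bigr]\Bigr)
     \Bigl(1 + i\alpha^k\frac{\sigma^k}{2}\Bigr)\psi \\
 &= \Bigl(1 + i\alpha^i\frac{\sigma^i}{2}\Bigr)D_\mu\psi,
\end{aligned}
\tag{15.30}
$$

up to terms of order $\alpha^2$. It is not difficult to check using (15.27) and
(15.21) that, even for finite transformations, the covariant derivative has the
same transformation law as the field on which it acts.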

Using the covariant derivative, we can build the most general gauge-invariant
Lagrangians involving $\psi$. But to write a complete Lagrangian, we
must also find gauge-invariant terms that depend only on $A^i_\mu$. To do this,
we construct the analogue of the electromagnetic field tensor. We will use the
second method of the previous section, working from the commutator of covariant
derivatives. The transformation law of the covariant derivative implies
that

$$
[D_\mu, D_\nu]\psi(x) \to V(x)\,[D_\mu, D_\nu]\psi(x).
\tag{15.31}
$$


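At the same time, by writing out the commutator using formula (15.24), we
can show, as in the Abelian case, that $[D_\mu, D_\nu]$ is not a differential
operator but merely a multiplicative factor (now a matrix) acting on $\psi$.
This time, however, there is a new feature: The last term in the expansion of
the commutator no longer vanishes. Instead, we find

$$
[D_\mu, D_\nu] = -igF^i_{\mu\nu}\frac{\sigma^i}{2},
\tag{15.32}
$$

with

$$
F^i_{\mu\nu}\frac{\sigma^i}{2}
  = \partial_\mu A^i_\nu\frac{\sigma^i}{2}
  - \partial_\nu A^i_\mu\frac{\sigma^i}{2}
  - ig\Bigl[A^i_\mu\frac{\sigma^i}{2},\, A^j_\nu\frac{\sigma^j}{2}\Bigr].
\tag{15.33}
$$

We can simplify this relation by applying the standard commutation relations
of Pauli matrices:

$$
\Bigl[\frac{\sigma^i}{2},\, \frac{\sigma^j}{2}\Bigr]
  = i\epsilon^{ijk}\frac{\sigma^k}{2}.
\tag{15.34}
$$

Then

$$
F^i_{\mu\nu} = \partial_\mu A^i_\nu - \partial_\nu A^i_\mu
  + g\,\epsilon^{ijk}A^j_\mu A^k_\nu.
\tag{15.35}
$$

The transformation law for the field strength follows from Eqs. (15.21)
and (15.31):

$$
F^i_{\mu\nu}\frac{\sigma^i}{2}
  \to V(x)\,F^j_{\mu\nu}\frac{\sigma^j}{2}\,V^\dagger(x).
\tag{15.36}
$$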

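The infinitesimal form is

$$
F^i_{\mu\nu}\frac{\sigma^i}{2}
  \to F^i_{\mu\nu}\frac{\sigma^i}{2}
    + \Bigl[i\alpha^i\frac{\sigma^i}{2},\, F^j_{\mu\nu}\frac{\sigma^j}{2}\Bigr].
\tag{15.37}
$$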


Notice that the field strength is no longer a gauge-invariant quantity. It
cannot be, since there are now three field strengths, each associated with a
given direction of rotation in the abstract space. However, it is easy to form
gauge-invariant combinations of the field strengths. For example,

$$
\mathcal{L} = -\frac{1}{2}\,\mathrm{tr}\Bigl[\Bigl(F^i_{\mu\nu}
   \frac{\sigma^i}{2}\Bigr)^2\Bigr] = -\frac{1}{4}(F^i_{\mu\nu})^2
\tag{15.38}
$$

is a gauge-invariant kinetic energy term for $A^i_\mu$. Notice that, in contrast
to the case of electrodynamics, this Lagrangian contains cubic and quartic terms
in $A^i_\mu$. Thus, this Lagrangian describes a nontrivial, interacting field
theory, called Yang-Mills theory. This is the simplest example of a non-Abelian
gauge theory.
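
Both claims of this paragraph, that the individual $F^i_{\mu\nu}$ change under
(15.36) while the trace in (15.38) does not, can be illustrated with a short
Python sketch. It works with a single fixed pair of indices $\mu$, $\nu$, for
which the trace identity reads $-\frac{1}{2}\mathrm{tr}[(F^i\sigma^i/2)^2] =
-\frac{1}{4}F^iF^i$. The helper names and the sample numbers are hypothetical;
the closed form used for $V$ is the standard exponential of a Pauli-matrix
combination and requires a nonzero `alpha`.

```python
import math

SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(m):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(tuple(complex(m[j][i]).conjugate() for j in range(2))
                 for i in range(2))


def trace(m):
    return m[0][0] + m[1][1]


def to_matrix(c):
    """Return c^i sigma^i / 2."""
    return (
        (c[2] / 2, (c[0] - 1j * c[1]) / 2),
        ((c[0] + 1j * c[1]) / 2, -c[2] / 2),
    )


def to_components(m):
    """Components c^i = tr(m sigma^i), real for Hermitian traceless m."""
    return [trace(matmul(m, s)).real for s in SIGMA]


def su2(alpha):
    """V = exp(i alpha^i sigma^i / 2) = cos(h) + i sin(h) n.sigma, h = |alpha|/2."""
    norm = math.sqrt(sum(a * a for a in alpha))  # must be nonzero
    n = [a / norm for a in alpha]
    c, s = math.cos(norm / 2), math.sin(norm / 2)
    return (
        (c + 1j * s * n[2], 1j * s * (n[0] - 1j * n[1])),
        (1j * s * (n[0] + 1j * n[1]), c - 1j * s * n[2]),
    )


f = [0.8, -0.3, 1.5]               # sample F^i for one fixed (mu, nu)
F = to_matrix(f)
V = su2([0.4, -1.2, 0.7])          # sample finite gauge transformation
F_t = matmul(matmul(V, F), dagger(V))   # Eq. (15.36)
f_t = to_components(F_t)

lagr = (-0.5 * trace(matmul(F, F))).real
lagr_t = (-0.5 * trace(matmul(F_t, F_t))).real
print(f, f_t)        # the components are rotated into one another
print(lagr, lagr_t)  # the trace combination is unchanged
```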

To construct a theory of Yang-Mills vector fields interacting with fermions,
we simply add the gauge-field Lagrangian (15.38) to the familiar Dirac
Lagrangian, with the ordinary derivative of $\psi$ replaced by the covariant
derivative. The result looks almost identical to the QED Lagrangian:

$$
\mathcal{L} = \bar\psi(i\not{\!\!D})\psi - \frac{1}{4}(F^i_{\mu\nu})^2
  - m\bar\psi\psi.
\tag{15.39}
$$

This is the famous Yang-Mills Lagrangian. Like that of QED, it depends on
two parameters: the scale factor $g$ (which is analogous to the electron charge)
and the fermion mass $m$. By varying this Lagrangian, we find the classical
equations of motion of the gauge theory. These are the Dirac equation for the
fermion field and the equation

$$
\partial^\mu F^i_{\mu\nu} + g\,\epsilon^{ijk}A^{j\mu}F^k_{\mu\nu}
  = -g\,\bar\psi\gamma_\nu\frac{\sigma^i}{2}\psi
\tag{15.40}
$$

for the vector field.

Everything that we have done for the $SU(2)$ symmetry transformation
(15.20) generalizes easily to any other continuous group of symmetries. The
full range of possible symmetry groups is enumerated and classified in
Section 15.4. For any such group, however, the general expressions for elements
of the Lagrangian are quite similar. Consider any continuous group of
transformations, represented by a set of $n \times n$ unitary matrices $V$. Then
the basic fields $\psi(x)$ will form an $n$-plet, and transform according to

$$
\psi(x) \to V(x)\psi(x),
\tag{15.41}
$$

where the $x$ dependence of $V$ makes the transformation local. In infinitesimal
form, $V(x)$ can be expanded in terms of a set of basic generators of the
symmetry group, which can be represented as Hermitian matrices $t^a$:

$$
V(x) = 1 + i\alpha^a(x)t^a + O(\alpha^2).
\tag{15.42}
$$

Now one can carry through the whole analysis from Eq. (15.22) to Eq. (15.33)
for a general local symmetry group simply by replacing

$$
\frac{\sigma^i}{2} \to t^a
\tag{15.43}
$$

at each step of the analysis.

To generalize the explicit expression (15.35) for the field tensor, we need
to know the commutation relations of the matrices $t^a$. It is conventional to
write these in the standard form

$$
[t^a, t^b] = if^{abc}t^c,
\tag{15.44}
$$

where $f^{abc}$ is a set of numbers called structure constants. This object
replaces $\epsilon^{ijk}$ in (15.34). It is conventional to choose a basis for
the matrices $t^a$ such that $f^{abc}$ is completely antisymmetric; we will
prove that this is always possible in Section 15.4.
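
As a concrete instance of (15.44), the Python sketch below extracts the
structure constants for the $SU(2)$ generators $t^a = \sigma^a/2$ and checks
that they are completely antisymmetric. It assumes the normalization
$\mathrm{tr}(t^a t^b) = \frac{1}{2}\delta^{ab}$, which holds for this choice of
generators and gives $f^{abc} = -2i\,\mathrm{tr}([t^a, t^b]\,t^c)$; the helper
names are hypothetical.

```python
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))


def trace(m):
    return m[0][0] + m[1][1]


# Generators t^a = sigma^a / 2, normalized so that tr(t^a t^b) = delta^ab / 2
T = [tuple(tuple(0.5 * s[i][j] for j in range(2)) for i in range(2))
     for s in SIGMA]


def structure_constant(a, b, c):
    """f^{abc} = -2i tr([t^a, t^b] t^c), assuming tr(t^a t^b) = delta^ab / 2."""
    value = -2j * trace(matmul(commutator(T[a], T[b]), T[c]))
    return value.real


F_ABC = [[[structure_constant(a, b, c) for c in range(3)]
          for b in range(3)] for a in range(3)]

print(F_ABC[0][1][2], F_ABC[1][0][2], F_ABC[0][0][2])
```

For these generators the result reproduces $\epsilon^{abc}$, in agreement with
(15.34).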

We can now recapitulate all of our results as follows. The covariant deriva¬ 
tive associated with the general transformation (15.41) is 

Df, = d fl — igA c pt a ; (15.45) 

it contains one vector field for each independent generator of the local sym¬ 
metry. The infinitesimal tranformation laws for ip and are 

ip -> (1 + in' 1 1' 1 )t/s; 

1 , , (15.46) 

a; a; + -d tl a a + f abc Ay. 

The finite transformation of has exactly the form of (15.27): 

Al(x)t a -> V(x) (Al{x)t a + ^d^vHx). (15.47) 

These transformation laws imply that the covariant derivative of $\psi$ has the
same transformation law as $\psi$ itself. The field tensor is defined by

$$[D_\mu, D_\nu] = -i g F^a_{\mu\nu}\, t^a, \tag{15.48}$$

or more explicitly,

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu. \tag{15.49}$$

This quantity has the infinitesimal transformation

$$F^a_{\mu\nu} \to F^a_{\mu\nu} - f^{abc} \alpha^b F^c_{\mu\nu}. \tag{15.50}$$

From Eqs. (15.46) and (15.50), one can show that any globally symmetric
function of $\psi$, $F^a_{\mu\nu}$, and their covariant derivatives is also
locally symmetric, and is therefore a candidate for a term in a gauge-invariant
Lagrangian. However, there are very few permissible terms up to dimension 4.
The most general gauge-invariant Lagrangian that is renormalizable and
conserves P and T is again given by Eq. (15.39). The corresponding classical
equation of motion is

$$\partial^\mu F^a_{\mu\nu} + g f^{abc} A^{b\mu} F^c_{\mu\nu} = -g\, j^a_\nu, \tag{15.51}$$

where

$$j^a_\nu = \bar\psi \gamma_\nu t^a \psi \tag{15.52}$$

is the global symmetry current of the fermion field.

Notice that the nonlinear terms in the Yang-Mills Lagrangian (15.39)
appear in the covariant derivative, where they are proportional to $t^a$, and in
the field tensor, where they are proportional to $f^{abc}$. Thus the form of the
interactions in a non-Abelian gauge theory is dictated by the local symmetry.
The nonlinear interactions of the vector field with itself are proportional to
the commutators of symmetry generators and thus explicitly require the
non-Abelian nature of the symmetry group.
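The link between the commutator (15.48) and the nonlinear term of (15.49) can
be seen in a small numerical sketch. It makes two simplifying assumptions that
are not in the text: the gauge group is $SU(2)$ with $t^a = \sigma^a/2$, and the
gauge field is constant, so that the derivative terms drop out and
$[D_\mu, D_\nu] = -g^2 [A_\mu, A_\nu]$. The numerical values of $g$ and of the
field components are arbitrary.

```python
# Assumption for illustration: SU(2) with t^a = sigma^a / 2, constant fields.
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2 (the SU(2) structure constants)."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def in_algebra(coeffs):
    """Matrix coeffs^a t^a for a list of three real coefficients."""
    return [[sum(coeffs[a] * T[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]

def commutator_of_derivatives(g, a_mu, a_nu):
    """[D_mu, D_nu] = -g^2 [A_mu, A_nu] when the fields are constant."""
    m, n = in_algebra(a_mu), in_algebra(a_nu)
    mn, nm = matmul(m, n), matmul(n, m)
    return [[-g * g * (mn[i][j] - nm[i][j]) for j in range(2)] for i in range(2)]

def field_tensor(g, a_mu, a_nu):
    """F^a_{mu nu} from (15.49) with the derivative terms set to zero."""
    return [g * sum(eps(a, b, c) * a_mu[b] * a_nu[c]
                    for b in range(3) for c in range(3)) for a in range(3)]

g = 0.7                       # arbitrary coupling
a_mu = [0.3, -1.1, 0.4]       # arbitrary constant A^a_mu
a_nu = [0.9, 0.2, -0.5]       # arbitrary constant A^a_nu
lhs = commutator_of_derivatives(g, a_mu, a_nu)
f = field_tensor(g, a_mu, a_nu)
rhs = [[-1j * g * x for x in row] for row in in_algebra(f)]   # -i g F^a t^a
mismatch = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(mismatch)
```

In an Abelian theory the same constant field would give a vanishing field
strength; here the commutator of generators produces a nonzero $F^a_{\mu\nu}$.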


15.3 The Gauge-Invariant Wilson Loop 

In both of the previous sections we made use of the comparator, $U(y,x)$, which
converts the fermion gauge transformation law at point $x$ to that at point $y$.
So far, in writing expressions for this object, it has sufficed to assume that
$x$ and $y$ are infinitesimally separated. However, it is worthwhile to think
further about the comparator in the case where $x$ and $y$ are far apart. This
discussion will give us further insights into the geometry of gauge invariance,
and will reveal some additional useful functions of the gauge field which we
will put to work in Chapter 19.

We first return to the Abelian theory and expand upon our discussion
of $U(y,x)$ in that context. In Eq. (15.10) we constructed a product of
comparators on a path that wound around a small square. We showed that this
product $\mathbf{U}(x)$ is not trivial, even though we eventually return to the
starting point; rather, we found that $\mathbf{U}(x)$ differs from 1 by a term
proportional to the electromagnetic field strength and to the area of the
square. This is a particular case of a general conclusion: The comparator
between two points $x$ and $y$ at finite separation depends on the path taken
from $x$ to $y$.





To explain this statement, it is useful to reverse some of the logic of
Section 15.1. We begin from the connection $A_\mu$, which we assume to have
the transformation law (15.7), and construct $U(z,y)$ as a function of $A_\mu$
that transforms according to (15.3). It is not difficult to verify that the
expression

$$U_P(z,y) = \exp\Big[ -ie \int_P dx^\mu A_\mu(x) \Big] \tag{15.53}$$

meets this criterion if the integral is taken along any path $P$ that runs from
$y$ to $z$. This object $U_P(z,y)$ is called the Wilson line.* Expression (15.53)
gives an explicit realization of the abstract comparator $U(z,y)$ for points at
finite separation.

A crucial property of the Wilson line is that it depends on the path $P$. If
$P$ is a closed path that returns to $y$, we obtain the Wilson loop,

$$U_P(y,y) = \exp\Big[ -ie \oint_P dx^\mu A_\mu(x) \Big]. \tag{15.54}$$

This quantity is a nontrivial function of $A_\mu$ that is, by construction,
locally gauge invariant. In fact, all gauge-invariant functions of $A_\mu$ can be
thought of as combinations of Wilson loops for various choices of the path $P$.
To motivate this claim, we use Stokes's theorem to rewrite the Wilson loop as

$$U_P(y,y) = \exp\Big[ -i\,\frac{e}{2} \int_\Sigma d\sigma^{\mu\nu} F_{\mu\nu} \Big], \tag{15.55}$$

where $\Sigma$ is a surface that spans the closed loop $P$, $d\sigma^{\mu\nu}$ is
an area element on this surface, and $F_{\mu\nu}$ is the field tensor (15.13).
This relation between the Wilson loop and the field strength is illustrated in
Fig. 15.2. Since the Wilson loop is gauge invariant, this argument gives one
more way to visualize the gauge invariance of the field strength. Conversely,
since (almost) all gauge-invariant functions of $A_\mu$ can be built up from
$F_{\mu\nu}$, this expression gives weight to the statement that $U_P(y,y)$ is
the most general gauge invariant.
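The equality of (15.54) and (15.55) can be illustrated with a simple numerical
sketch. It assumes a uniform field in a two-dimensional plane, described by the
potential $(A_1, A_2) = \frac{B}{2}(-y, x)$ so that $F_{12} = B$, and it treats
these as the components that enter $dx^\mu A_\mu$ directly, ignoring metric
signs. The values of $e$, $B$, and the rectangle are arbitrary, and the function
names are illustrative.

```python
import cmath

def potential(x, y, b):
    """Assumed vector potential (A_1, A_2) = (b/2)(-y, x), with F_12 = b."""
    return (-0.5 * b * y, 0.5 * b * x)

def line_integral(corners, b, steps=200):
    """Midpoint-rule estimate of the closed line integral of A along a polygon."""
    total = 0.0
    closed = corners + [corners[0]]
    for (x0, y0), (x1, y1) in zip(closed, closed[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for k in range(steps):
            ax, ay = potential(x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy, b)
            total += ax * dx + ay * dy
    return total

def wilson_loop(corners, e, b):
    """Abelian Wilson loop exp(-i e * closed line integral), as in (15.54)."""
    return cmath.exp(-1j * e * line_integral(corners, b))

E_CHARGE, B_FIELD = 0.3, 2.0
RECTANGLE = [(0.0, 0.0), (1.5, 0.0), (1.5, 1.0), (0.0, 1.0)]   # area 1.5
AREA = 1.5

loop = wilson_loop(RECTANGLE, E_CHARGE, B_FIELD)
# In (15.55), (e/2) * sum over (mu,nu) = (1,2) and (2,1) gives e * F_12 * area.
flux_form = cmath.exp(-1j * E_CHARGE * B_FIELD * AREA)
print(abs(loop - flux_form))
```

Changing the potential by a gradient, for example to $(A_1, A_2) = (-By, 0)$,
should leave the loop unchanged, since only the flux through the surface enters.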

Both the Wilson line and the Wilson loop can be generalized to the
non-Abelian case. Here, however, additional subtleties arise when we consider
exponentials of noncommuting matrices. Let us first construct the Wilson line,
which now transforms according to Eq. (15.22). It is not correct to make a
straightforward rewriting of (15.53) with the integral of $A^a_\mu t^a$ in the
exponent, since these matrices do not necessarily commute at different points.
Instead, we must order these matrices in a particular way. We will now give the
correct ordering prescription and then prove its transformation law.

* This path-dependent phase was used long before Wilson's work, in Schwinger's
early papers on QED, and in Y. Aharonov and D. Bohm, Phys. Rev. 115, 485 (1959).

Figure 15.2. The Wilson loop integral is taken around an arbitrary loop.
It can also be expressed as a flux integral of the field strength over a surface
spanning the loop.

Let $s$ be a parameter of the path $P$, running from 0 at $x = y$ to 1 at $x = z$.
Then define the Wilson line as the power-series expansion of the exponential,
with the matrices in each term ordered so that higher values of $s$ stand to the
left. This prescription is called path-ordering and is denoted by the symbol
$P\{\,\}$. Thus the Wilson line is written

$$U_P(z,y) = P\Big\{ \exp\Big[ ig \int_0^1 ds\, \frac{dx^\mu}{ds}\, A^a_\mu(x(s))\, t^a \Big] \Big\}. \tag{15.56}$$

This expression is similar to the time-ordered exponential that we wrote for
the interaction-picture propagator in Eq. (4.23). Pursuing this analogy, one
can show that this expression for $U_P$ is the solution of a differential
equation similar to (4.24):

$$\frac{d}{ds}\, U_P(x(s),y) = \Big( ig\, \frac{dx^\mu}{ds}\, A^a_\mu(x(s))\, t^a \Big)\, U_P(x(s),y). \tag{15.57}$$

(Here we consider $U_P$ to be a continuous function of the parameter $s$, rather
than fixing $s = 1$ at the endpoint.)
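The path-ordering prescription of (15.56) can be sketched numerically by
multiplying many short Wilson-line segments, with later segments placed to the
left. The example below assumes $SU(2)$ with $t^a = \sigma^a/2$ and writes
$\theta^a(s)$ for the combination $g\,(dx^\mu/ds)\,A^a_\mu(x(s))$; the two
profiles `rotating` and `fixed_axis` are invented purely for illustration.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def su2_exp(theta):
    """exp(i theta^a sigma^a / 2), via the closed form valid for SU(2)."""
    norm = math.sqrt(sum(t * t for t in theta))
    c = math.cos(norm / 2)
    s = math.sin(norm / 2) / norm if norm > 0 else 0.5   # limit as norm -> 0
    return [[c * (i == j) + 1j * s * sum(theta[a] * SIGMA[a][i][j] for a in range(3))
             for j in range(2)] for i in range(2)]

def path_ordered(theta_of_s, steps=2000):
    """Product of short segments; factors at larger s stand to the left."""
    ds = 1.0 / steps
    u = su2_exp([0.0, 0.0, 0.0])
    for k in range(steps):
        step = su2_exp([t * ds for t in theta_of_s((k + 0.5) * ds)])
        u = matmul(step, u)
    return u

def naive_exponential(theta_of_s, steps=2000):
    """Exponential of the integrated exponent, ignoring the ordering."""
    ds = 1.0 / steps
    total = [0.0, 0.0, 0.0]
    for k in range(steps):
        for a, t in enumerate(theta_of_s((k + 0.5) * ds)):
            total[a] += t * ds
    return su2_exp(total)

def distance(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

def rotating(s):      # direction in the algebra changes along the path
    return [2.0 * (1.0 - s), 2.0 * s, 0.0]

def fixed_axis(s):    # direction fixed, so all factors commute
    return [2.0 * s, 0.0, 0.0]

print(distance(path_ordered(rotating), naive_exponential(rotating)))
print(distance(path_ordered(fixed_axis), naive_exponential(fixed_axis)))
```

When the matrices at different points commute, the ordering is irrelevant and
the two constructions agree; when they do not, the naive exponential differs
from the path-ordered product by a finite amount.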

To show that expression (15.56) is the correct generalization of the Wilson
line, we must show that it satisfies the correct gauge transformation
law (15.22). This follows from the differential equation (15.57), which can
be rewritten as

$$\frac{dx^\mu}{ds}\, D_\mu U_P(x,y) = 0. \tag{15.58}$$

Now let $A^V$ represent the gauge transform of a field configuration $A$, and use
these arguments to denote explicitly the dependence of gauge functions on
the gauge field. We would like to show that

$$U_P(z,y,A^V) = V(z)\, U_P(z,y,A)\, V^\dagger(y), \tag{15.59}$$

which is equivalent to (15.22). In (15.30) we proved, in its infinitesimal
version, the relation

$$D_\mu(A^V)\, V(x) = V(x)\, D_\mu(A). \tag{15.60}$$

This relation implies that the right-hand side of (15.59) satisfies (15.58) for
the gauge field $A^V$ if $U_P(z,y,A)$ satisfies this equation for the gauge
field $A$. But the solution of a first-order differential equation with a fixed
boundary condition is unique. Thus, if $U_P(z,y)$ is defined to be the solution
of (15.57) or (15.58), it indeed has the transformation law (15.59).

The Wilson line associated with a closed path returning to $y$ transforms
only with the gauge parameter at $y$; however, it is not a gauge invariant:

$$U_P(y,y) \to V(y)\, U_P(y,y)\, V^\dagger(y). \tag{15.61}$$

To understand this transformation better, one can work out the expression
for $U_P(x,x)$, where the path is the small square in the (1,2) plane shown
in Fig. 15.1. In addition to the terms in Eq. (15.11), there are additional
corrections of order $\epsilon^2$ coming from products of $(A^a_\mu t^a)$ factors
from pairs of sides, which sum up to a commutator of these factors. One finds

$$U_P(x,x) = 1 + ig\epsilon^2 F^a_{12}(x)\, t^a + \mathcal{O}(\epsilon^3), \tag{15.62}$$

where $F^a_{12}$ is given by the full expression in (15.49). If we then expand
the transformation law (15.61) in powers of $\epsilon$, the term of order
$\epsilon^2$ is the transformation law of $F^a_{\mu\nu}$ given in Eq. (15.36).

To convert the Wilson line for a closed path into a true gauge invariant, 
take the trace. By cyclic invariance, (15.61) implies 

$$\mathrm{tr}\, U_P(x,x) \to \mathrm{tr}\, U_P(x,x). \tag{15.63}$$

Thus for a non-Abelian gauge theory, we define the Wilson loop to be the 
trace of the Wilson line around a closed path. 

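Let us evaluate $\mathrm{tr}\, U_P(x,x)$ more explicitly for the case of an $SU(2)$
gauge group. If $U(\epsilon)$ is any $2 \times 2$ unitary matrix that tends to 1 as
$\epsilon \to 0$, we can expand it in $\epsilon$ as follows:

$$\begin{aligned}
U(\epsilon) &= \exp\Big[ i(\epsilon\beta^i + \epsilon^2\gamma^i + \cdots)\frac{\sigma^i}{2} \Big] \\
&= 1 + i(\epsilon\beta^i + \epsilon^2\gamma^i + \cdots)\frac{\sigma^i}{2}
 - \frac{1}{2}(\epsilon\beta^i \cdot \epsilon\beta^j)\frac{\sigma^i\sigma^j}{4} + \cdots.
\end{aligned} \tag{15.64}$$

Then, since the Pauli matrices are traceless and satisfy
$\mathrm{tr}[\sigma^i\sigma^j] = 2\delta^{ij}$,

$$\mathrm{tr}\, U(\epsilon) = 2 - \frac{1}{4}\epsilon^2 (\beta^i)^2 + \mathcal{O}(\epsilon^3). \tag{15.65}$$

Applying this formula to Eq. (15.62), we find

$$\mathrm{tr}\, U_P(x,x) = 2 - \frac{1}{4} g^2 \epsilon^4 (F^i_{12})^2 + \mathcal{O}(\epsilon^5). \tag{15.66}$$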
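The coefficient in (15.66) can be checked with a short numerical sketch. As a
simplifying assumption, the small loop is modeled by the pure exponential
$\exp[i g \epsilon^2 F^i_{12} \sigma^i/2]$, which agrees with (15.62) through
order $\epsilon^2$ but ignores the unspecified higher-order terms. The matrix
exponential is summed as a truncated power series, and the numerical values are
arbitrary.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(m, terms=30):
    """exp(m) for a 2x2 matrix, summed as a truncated power series."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def trace_of_small_loop(g, side, f12):
    """Real part of tr exp(i g side^2 F^i_12 sigma^i / 2)."""
    m = [[1j * g * side ** 2 * sum(f12[a] * SIGMA[a][i][j] for a in range(3)) / 2
          for j in range(2)] for i in range(2)]
    u = mat_exp(m)
    return (u[0][0] + u[1][1]).real

def leading_terms(g, side, f12):
    """Right-hand side of (15.66) without the higher-order remainder."""
    return 2 - 0.25 * g ** 2 * side ** 4 * sum(x * x for x in f12)

G, F12 = 1.3, [0.4, -0.7, 0.2]    # arbitrary coupling and field strength
for side in (0.2, 0.1, 0.05):
    print(side, trace_of_small_loop(G, side, F12) - leading_terms(G, side, F12))
```

The deviation of the trace from 2 scales as the fourth power of the side of the
square, that is, as the square of the enclosed area.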

Thus the gauge invariance of $(F^a_{\mu\nu})^2$ can be derived from a geometrical
argument, just as in the Abelian case. Using the notation that will be
introduced in the next section, one can show that the same argument goes
through for any gauge group.


15.4 Basic Facts About Lie Algebras

At the end of Section 15.2 we saw that the class of non-Abelian gauge theories
is very large. To work with these theories most efficiently, it is worthwhile to
pause and consider the general properties of the continuous groups on which
they are based. In this section we will enumerate all the possible groups that
can be used to construct non-Abelian gauge theories. We will then compute
some numerical factors, built out of group transformation matrices, that are
needed in performing explicit calculations in quantized gauge theories.*

To a mathematician, a group is made up of abstract entities that obey
certain algebraic rules. In quantum mechanics, however, we are interested
specifically in groups of unitary operators that act on the vector space of
quantum states. We focus our attention on continuously generated groups,
that is, groups that contain elements arbitrarily close to the identity, such
that the general element can be reached by the repeated action of these
infinitesimal elements. Then any infinitesimal group element $g$ can be written

$$g(\alpha) = 1 + i\alpha^a T^a + \mathcal{O}(\alpha^2). \tag{15.67}$$

The coefficients of the infinitesimal group parameters $\alpha^a$ are Hermitian
operators $T^a$, called the generators of the symmetry group. A continuous group
with this structure is called a Lie group.

The set of generators $T^a$ must span the space of infinitesimal group
transformations, so the commutator of generators $T^a$ must be a linear
combination of generators. Thus the commutation relations of the operators
$T^a$ can be written

$$[T^a, T^b] = i f^{abc}\, T^c; \tag{15.68}$$

the numbers $f^{abc}$ are called structure constants. The vector space spanned by
the generators, with the additional operation of commutation, is called a Lie
algebra.

The commutation relations (15.68) and the identity

$$[T^a, [T^b, T^c]] + [T^b, [T^c, T^a]] + [T^c, [T^a, T^b]] = 0 \tag{15.69}$$

imply that the structure constants obey

$$f^{ade} f^{bcd} + f^{bde} f^{cad} + f^{cde} f^{abd} = 0, \tag{15.70}$$

called the Jacobi identity. From the mathematician's viewpoint (considering
the generators to be abstract entities rather than Hermitian operators), the
Jacobi identity is an axiom that must be satisfied in order for a given set of
commutation rules to define a Lie algebra.

*In this section we will state, without proof, some general results from the
theory of continuous groups. There are many excellent books that review these
mathematical results systematically. Among these, we recommend especially
Cahn (1984), for a brief but incisive discussion, and S. Helgason, Differential
Geometry, Lie Groups, and Symmetric Spaces (Academic Press, 1978), which gives
an elegant and rigorous account. R. Slansky, Phys. Repts. 79, 1 (1981), has
compiled an especially useful set of tables of group-theoretic identities
relevant to the construction of non-Abelian gauge theories.
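As an illustration of the Jacobi identity for structure constants, the sketch
below evaluates the combination
$f^{ade} f^{bcd} + f^{bde} f^{cad} + f^{cde} f^{abd}$ for every choice of the free
indices. It assumes the $SU(2)$ structure constants $f^{abc} = \epsilon^{abc}$ as a
test case; the function names are illustrative.

```python
def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2: the structure constants of SU(2)."""
    return (a - b) * (b - c) * (c - a) / 2

def jacobi_residual(f, n):
    """Largest |f^{ade} f^{bcd} + f^{bde} f^{cad} + f^{cde} f^{abd}| over a, b, c, e."""
    worst = 0.0
    for a in range(n):
        for b in range(n):
            for c in range(n):
                for e in range(n):
                    total = sum(
                        f(a, d, e) * f(b, c, d)
                        + f(b, d, e) * f(c, a, d)
                        + f(c, d, e) * f(a, b, d)
                        for d in range(n)
                    )
                    worst = max(worst, abs(total))
    return worst

print(jacobi_residual(eps, 3))
```

The same function can be applied to any proposed set of structure constants
`f(a, b, c)` to test whether they can define a Lie algebra.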

The commutation relations of the Lie algebra completely determine the
group multiplication law of an associated Lie group sufficiently close to the
identity. For large enough transformations, additional global questions come
into play; to give a familiar example, $SU(2)$ and $O(3)$ have the same
commutation relations but different global structure. However, the Lagrangian
of a non-Abelian gauge theory depends only on the Lie algebra of the local
symmetry group, so we will ignore these global questions from here on.

Classification of Lie Algebras 

For the application to gauge theories, the local symmetry is normally a
unitary transformation of a set of fields. Thus we are primarily interested in
Lie algebras that have finite-dimensional Hermitian representations, leading
to finite-dimensional unitary representations of the corresponding Lie group.
We will also assume that the number of generators is finite. Such Lie algebras
are called compact, because these conditions imply that the Lie group is
a finite-dimensional compact manifold.

If one of the generators $T^a$ commutes with all of the others, it generates an
independent continuous Abelian group. Such a group, which has the structure
of the group of phase rotations

$$\psi \to e^{i\alpha}\, \psi, \tag{15.71}$$

we call $U(1)$. If the algebra contains no such commuting elements, so that the
group contains no $U(1)$ factors, then we call the algebra semi-simple. If, in
addition, the Lie algebra cannot be divided into two mutually commuting sets
of generators, the algebra is simple. A general Lie algebra is the direct sum of
non-Abelian simple components and additional Abelian generators.

Surprisingly, the basic conditions that a Lie algebra be compact and simple
turn out to be extremely restrictive. In one of the triumphs of
nineteenth-century mathematics, Killing and Cartan classified all possible
compact simple Lie algebras. Almost all of these algebras belong to one of
three infinite families, with only five exceptions. The three infinite families
are the algebras corresponding to the so-called classical groups, whose
structures are conveniently defined in terms of particular matrix
representations. The definitions of the three families of classical groups are
as follows:

1. Unitary transformations of N-dimensional vectors. Let $\xi$ and $\eta$ be
complex $N$-vectors. A general linear transformation then has the form

$$\eta_a \to U_{ab}\, \eta_b, \qquad \xi_a \to U_{ab}\, \xi_b. \tag{15.72}$$

We say that this transformation is unitary if it preserves the inner product
$\eta_a^* \xi_a$. The pure phase transformations

$$\xi_a \to e^{i\alpha}\, \xi_a \tag{15.73}$$

form a $U(1)$ subgroup which commutes with all other unitary transformations;
we remove this subgroup to form a simple Lie group, called $SU(N)$: it consists
of all $N \times N$ unitary transformations satisfying $\det(U) = 1$. The
generators of $SU(N)$ are represented by $N \times N$ Hermitian matrices $t^a$,
subject to the condition that they be orthogonal to the generator of (15.73):

$$\mathrm{tr}[t^a] = 0. \tag{15.74}$$

There are $N^2 - 1$ independent matrices satisfying these conditions.

2. Orthogonal transformations of N-dimensional vectors. This is the subgroup
of unitary $N \times N$ transformations that preserves the symmetric inner
product

$$\eta_a E_{ab}\, \xi_b, \qquad \text{with } E_{ab} = \delta_{ab}. \tag{15.75}$$

This is the usual dot product of vectors, and so this group is the rotation
group in $N$ dimensions, $SO(N)$. (Adding the reflection gives the group $O(N)$.)
There is an independent rotation corresponding to each plane in $N$ dimensions,
so $SO(N)$ has $N(N-1)/2$ generators.

3. Symplectic transformations of N-dimensional vectors. This is the subgroup
of unitary $N \times N$ transformations, for $N$ even, that preserves the
antisymmetric inner product

$$\eta_a E_{ab}\, \xi_b, \qquad \text{with } E_{ab} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{15.76}$$

where the elements of the matrix are $N/2 \times N/2$ blocks. This group is called
$Sp(N)$; it has $N(N+1)/2$ generators.

Beyond these three families, there are five more exceptional Lie algebras,
denoted in Cartan's classification system as $G_2$, $F_4$, $E_6$, $E_7$, and
$E_8$. Of these, $E_6$ and $E_8$ have been applied as local symmetry groups in
interesting unified models of the fundamental interactions. However, we will
not consider these exceptional groups further in this book. In fact, most of
our examples will involve only $SU(N)$ groups.

Representations 

Once we have specified the local symmetry group, the fields that appear in
the Lagrangian most naturally transform according to a finite-dimensional
unitary representation of this group. Thus we might next ask how to
systematically find all such representations of any given Lie group. Recall
that for the group $SU(2)$, the representations can be constructed directly
from the commutation relations, using the raising and lowering operators $J_+$
and $J_-$. This construction can be generalized to find the finite-dimensional
representations of any compact Lie algebra. In this book, however, we will work
with relatively simple representations whose structure we can work out by less
formal methods.

Before discussing representations of Lie algebras, we should review some
general aspects of group representations. Given a symmetry group $G$, a
finite-dimensional unitary representation of the group's Lie algebra is a set
of $d \times d$ Hermitian matrices $t^a$ that satisfy the commutation relations
(15.68). The size $d$ is the dimension of the representation. An arbitrary
representation can generally be decomposed by finding a basis in which all
representation matrices are simultaneously block-diagonal. Through this change
of basis, we can write the representation as the direct sum of irreducible
representations. We denote the representation matrices in the irreducible
representation $r$ by $t^a_r$.

It is standard practice to adopt a normalization convention for the matrices
$t^a_r$, based on traces of their products. If the Lie algebra is semi-simple,
the matrices $t^a_r$ themselves are traceless. Consider, however, the trace of
the product of two generator matrices:

$$\mathrm{tr}[t^a_r t^b_r] \equiv D^{ab}. \tag{15.77}$$

As long as the generator matrices are Hermitian, the matrix $D^{ab}$ is positive
definite. Let us choose a basis for the generators $T^a$ so that this matrix is
proportional to the identity. It can be shown that, once this is done for one
irreducible representation, it is true for all irreducible representations. Thus,
in this basis,

$$\mathrm{tr}[t^a_r t^b_r] = C(r)\, \delta^{ab}, \tag{15.78}$$

where $C(r)$ is a constant for each representation $r$. Equation (15.78) and
the commutation relations (15.68) yield the following representation of the
structure constants:

$$f^{abc} = -\frac{i}{C(r)}\, \mathrm{tr}\big\{ [t^a_r, t^b_r]\, t^c_r \big\}. \tag{15.79}$$

This equation implies that $f^{abc}$ is totally antisymmetric.
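Formula (15.79) can be tried out directly on a known example. The sketch below
assumes the fundamental representation of $SU(2)$, $t^a = \sigma^a/2$, for which
$C(r) = 1/2$, and recovers the structure constants from traces; the expected
answer in this case is the Levi-Civita symbol. The helper names are
illustrative.

```python
# Assumption for illustration: fundamental representation of SU(2).
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(x):
    return x[0][0] + x[1][1]

def trace_norm(a, b):
    """tr[t^a t^b], which should equal C(r) delta^{ab} as in (15.78)."""
    return trace(matmul(T[a], T[b]))

def structure_constant(a, b, c, c_r=0.5):
    """f^{abc} = -(i / C(r)) tr{[t^a, t^b] t^c}, Eq. (15.79)."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    return -1j / c_r * trace(matmul(comm, T[c]))

def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2

worst = max(abs(structure_constant(a, b, c) - eps(a, b, c))
            for a in range(3) for b in range(3) for c in range(3))
print(worst)
```

Because the trace of a commutator times a third matrix is antisymmetric under
exchange of any two of the three matrices, the total antisymmetry of $f^{abc}$
is visible directly in this form.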

For each irreducible representation $r$ of $G$, there is an associated conjugate
representation $\bar r$. The representation $r$ yields the infinitesimal
transformation

$$\phi \to (1 + i\alpha^a t^a_r)\,\phi. \tag{15.80}$$

The complex conjugate of this transformation,

$$\phi^* \to (1 - i\alpha^a (t^a_r)^*)\,\phi^*, \tag{15.81}$$

must also be the infinitesimal element of a representation of $G$. Thus the
conjugate representation to $r$ has representation matrices

$$t^a_{\bar r} = -(t^a_r)^* = -(t^a_r)^T. \tag{15.82}$$

Since $\phi^*\phi$ is invariant to unitary transformations, it is possible to
combine fields transforming in the representations $r$ and $\bar r$ to form a
group invariant.

It is possible that the representation $\bar r$ may be equivalent to $r$, if
there is a unitary transformation $U$ such that $t^a_{\bar r} = U t^a_r U^\dagger$.
If so, the representation $r$ is real. In this case, there is a matrix $G_{ab}$
such that, if $\eta$ and $\xi$ belong to the representation $r$, the combination
$G_{ab}\,\eta_a \xi_b$ is an invariant. It is sometimes useful to distinguish the
case in which $G_{ab}$ is symmetric from that in which $G_{ab}$ is antisymmetric.
In the former case the representation is strictly real; in the latter case it
is pseudoreal. Both cases occur already in $SU(2)$: The invariant combination of
two vectors is $v^a w^a$, so the vector is a real representation; the invariant
combination of two spinors is $\epsilon^{\alpha\beta}\eta_\alpha \xi_\beta$, so the
spinor is a pseudoreal representation.
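The pseudoreality of the $SU(2)$ spinor can be verified explicitly. The sketch
below assumes $t^a = \sigma^a/2$ and takes the candidate matrix relating $r$ and
$\bar r$ to be $\epsilon^{\alpha\beta}$ itself (equal to $i\sigma^2$); it checks
that $U t^a U^\dagger = -(t^a)^*$, that $\epsilon^{\alpha\beta}\eta_\alpha\xi_\beta$
is invariant to first order, and that the matrix is antisymmetric. The variable
and function names are illustrative.

```python
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
EPSILON = [[0, 1], [-1, 0]]   # epsilon^{alpha beta}; real, so U^dagger = U^T

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]

def conjugate_rep(t):
    """Generator of the conjugate representation, -(t)^*, as in (15.82)."""
    return [[-complex(v).conjugate() for v in row] for row in t]

def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j]) for i in range(2) for j in range(2))

# Equivalence of r and r-bar: U t^a U^dagger = -(t^a)^*.
equivalence_error = max(
    max_abs_diff(matmul(matmul(EPSILON, t), transpose(EPSILON)), conjugate_rep(t))
    for t in T
)

# First-order invariance of eta^T epsilon xi: (t^a)^T epsilon + epsilon t^a = 0.
zero = [[0, 0], [0, 0]]
invariance_error = max(
    max_abs_diff(
        [[matmul(transpose(t), EPSILON)[i][j] + matmul(EPSILON, t)[i][j]
          for j in range(2)] for i in range(2)],
        zero,
    )
    for t in T
)

is_antisymmetric = transpose(EPSILON) == [[-v for v in row] for row in EPSILON]
print(equivalence_error, invariance_error, is_antisymmetric)
```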

With this language we can discuss the simplest representations of the
classical groups. In $SU(N)$, the basic irreducible representation (often called
the fundamental representation) is the $N$-dimensional complex vector. For
$N > 2$ this representation is complex, so that there is a second,
inequivalent, representation $\bar N$. (In $SU(2)$ this representation is the
pseudoreal spinor representation.) In $SO(N)$, the basic $N$-dimensional vector
is a (strictly) real representation. In $Sp(N)$, the $N$-dimensional vector is a
pseudoreal representation.

Another irreducible representation, present for any simple Lie algebra, is
the one to which the generators of the algebra belong. This representation is
called the adjoint representation and denoted by $r = G$. The representation
matrices are given by the structure constants:

$$(t^b_G)_{ac} = i f^{abc}. \tag{15.83}$$

With this definition, the statement that $t_G$ satisfies the Lie algebra

$$([t^b_G, t^c_G])_{ae} = i f^{bcd}\, (t^d_G)_{ae} \tag{15.84}$$

is just a rewriting of the Jacobi identity (15.70). Since the structure constants
are real and antisymmetric, $t^a_G = -(t^a_G)^*$; thus the adjoint representation
is always a real representation. From the descriptions of the Lie groups given
above, the dimension of the adjoint representation $d(G)$ is given, for the
classical groups, by

$$d(G) = \begin{cases}
N^2 - 1 & \text{for } SU(N), \\
N(N-1)/2 & \text{for } SO(N), \\
N(N+1)/2 & \text{for } Sp(N).
\end{cases} \tag{15.85}$$
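To make the adjoint representation concrete, the following sketch builds the
matrices (15.83) for $SU(2)$, assuming $f^{abc} = \epsilon^{abc}$, and checks the
statements made above: they satisfy the Lie algebra (15.84), they obey
$t^a_G = -(t^a_G)^*$, and their size matches $d(G) = N^2 - 1 = 3$. The function
names are illustrative.

```python
def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2: the structure constants of SU(2)."""
    return (a - b) * (b - c) * (c - a) / 2

def adjoint_generator(b):
    """Matrix with entries (t^b_G)_{ac} = i f^{abc}, Eq. (15.83)."""
    return [[1j * eps(a, b, c) for c in range(3)] for a in range(3)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

T_G = [adjoint_generator(b) for b in range(3)]

def algebra_violation():
    """Largest entry of [t^b_G, t^c_G] - i f^{bcd} t^d_G over all b, c."""
    worst = 0.0
    for b in range(3):
        for c in range(3):
            bc, cb = matmul(T_G[b], T_G[c]), matmul(T_G[c], T_G[b])
            for a in range(3):
                for e in range(3):
                    rhs = sum(1j * eps(b, c, d) * T_G[d][a][e] for d in range(3))
                    worst = max(worst, abs(bc[a][e] - cb[a][e] - rhs))
    return worst

def reality_violation():
    """Largest entry of t^a_G + (t^a_G)^*; zero for a real representation."""
    return max(abs(v + v.conjugate()) for t in T_G for row in t for v in row)

print(algebra_violation(), reality_violation(), len(T_G[0]))
```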

The identification of $f^{abc}$ as a representation matrix allows us to gain
further insight into some of the quantities introduced in Section 15.2. The
covariant derivative acting on a field in the adjoint representation is

$$\begin{aligned}
(D_\mu \phi)_a &= \partial_\mu \phi_a - i g A^b_\mu (t^b_G)_{ac}\, \phi_c \\
&= \partial_\mu \phi_a + g f^{abc} A^b_\mu \phi_c.
\end{aligned} \tag{15.86}$$

Thus we can recognize the infinitesimal form of the gauge transformation of
the vector field in (15.46) as the motion

$$A^a_\mu \to A^a_\mu + \frac{1}{g}\,(D_\mu \alpha)^a. \tag{15.87}$$

The gauge field equation of motion (15.51) can be rewritten as

$$(D^\mu F_{\mu\nu})^a = -g\, j^a_\nu. \tag{15.88}$$

In both of these expressions, the arbitrary-looking terms involving $f^{abc}$
arise naturally as part of a covariant derivative. An additional identity
follows from considering the antisymmetric double commutator of covariant
derivatives,

$$\epsilon^{\mu\nu\lambda\sigma}\, [D_\nu, [D_\lambda, D_\sigma]].$$

This quantity vanishes by its total antisymmetry, in the same way as (15.69).
This result can be reduced to the identity

$$\epsilon^{\mu\nu\lambda\sigma}\, (D_\nu F_{\lambda\sigma})^a = 0. \tag{15.89}$$

This equation, called the Bianchi identity of a non-Abelian gauge theory, is
the analogue of the homogeneous Maxwell equations in electrodynamics.


The Casimir Operator 

In $SU(2)$, we characterize representations by the eigenvalue of the total spin
$J^2$. In fact, for any simple Lie algebra, the operator

$$T^2 = T^a T^a \tag{15.90}$$

(with the repeated index summed, as always) commutes with all group
generators:

$$\begin{aligned}
[T^b, T^a T^a] &= (i f^{bac} T^c)\, T^a + T^a\, (i f^{bac} T^c) \\
&= i f^{bac}\, \{T^c, T^a\},
\end{aligned} \tag{15.91}$$

which vanishes by the antisymmetry of $f^{abc}$. In other words, $T^2$ is an
invariant of the algebra; this implies that $T^2$ takes a constant value on each
irreducible representation. Thus, the matrix representation of $T^2$ is
proportional to the unit matrix:

$$t^a_r t^a_r = C_2(r) \cdot 1, \tag{15.92}$$

where 1 is the $d(r) \times d(r)$ unit matrix and $C_2(r)$ is a constant, called
the quadratic Casimir operator, for each representation. For the adjoint
representation, Eq. (15.92) is more conveniently written as

$$f^{acd} f^{bcd} = C_2(G)\, \delta^{ab}. \tag{15.93}$$

Casimir operators appear very often in computations in non-Abelian gauge
theories. Furthermore, the related invariant $C(r)$ given by (15.78) is simply
related to the Casimir operator: If we contract (15.78) with $\delta^{ab}$ and
evaluate the left-hand side using (15.92), we find

$$d(r)\, C_2(r) = d(G)\, C(r). \tag{15.94}$$

Thus it will be useful for us to compute $C_2(r)$ for the simplest $SU(N)$
representations.

For $SU(2)$, the fundamental two-dimensional representation is the spinor
representation, which is given in terms of Pauli matrices by

$$t^a_2 = \frac{\sigma^a}{2}. \tag{15.95}$$

These satisfy $\mathrm{tr}[t^a_2 t^b_2] = \frac{1}{2}\delta^{ab}$. We will choose
the generators of $SU(N)$ so that three of these are the generators (15.95),
acting on the first two components of the $N$-vector $\xi$. Then, for any
matrices of the fundamental representation,

$$\mathrm{tr}[t^a_N t^b_N] = \frac{1}{2}\, \delta^{ab}. \tag{15.96}$$

This convention fixes the values of $C(r)$ and $C_2(r)$ for all of the irreducible
representations of $SU(N)$. For the fundamental representations $N$ and $\bar N$,
$C(N)$ is given directly by (15.96), and $C_2(N)$ follows from (15.94). We find

$$C(N) = \frac{1}{2}, \qquad C_2(N) = \frac{N^2 - 1}{2N}. \tag{15.97}$$
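The definitions (15.78) and (15.92) and the relation (15.94) can be checked on
the smallest example. The sketch below assumes $SU(2)$, with the fundamental
representation $t^a = \sigma^a/2$ and the adjoint matrices
$(t^b_G)_{ac} = i\epsilon^{abc}$; for $N = 2$, Eq. (15.97) gives $C(N) = 1/2$ and
$C_2(N) = 3/4$. The helper names are illustrative.

```python
def eps(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2: the structure constants of SU(2)."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

FUNDAMENTAL = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
ADJOINT = [[[1j * eps(a, b, c) for c in range(3)] for a in range(3)]
           for b in range(3)]

def trace_constant(gens):
    """C(r), read off from tr[t^1 t^1] as in (15.78)."""
    tt = matmul(gens[0], gens[0])
    return sum(tt[i][i] for i in range(len(tt))).real

def casimir(gens):
    """C_2(r), read off from the (1,1) entry of t^a t^a as in (15.92)."""
    return sum(matmul(t, t)[0][0] for t in gens).real

D_G = 3   # number of generators of SU(2)
for name, gens in (("fundamental", FUNDAMENTAL), ("adjoint", ADJOINT)):
    d_r = len(gens[0])
    # Columns: name, C(r), C_2(r), and the two sides of (15.94).
    print(name, trace_constant(gens), casimir(gens),
          d_r * casimir(gens), D_G * trace_constant(gens))
```

For the adjoint representation $d(r) = d(G)$, so (15.94) requires
$C_2(G) = C(G)$, which the second line of output should reflect.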

To compute the Casimir operator for the adjoint representation, we build
up this representation from the product of the $N$ and $\bar N$. Let us first
discuss the product of irreducible representations more generally. The direct
product of two representations $r_1$, $r_2$ is a representation of dimension
$d(r_1) \cdot d(r_2)$. An object that transforms according to this representation
can be written as a tensor $\Xi_{pq}$, in which the first index transforms
according to $r_1$, the second according to $r_2$. In general, such a product can
be decomposed into a direct sum of irreducible representations; symbolically,
we write

$$r_1 \times r_2 = \sum_i r_i. \tag{15.98}$$

The representation matrices in the representation $r_1 \times r_2$ are

$$t^a_{r_1 \times r_2} = t^a_{r_1} \otimes 1 + 1 \otimes t^a_{r_2}, \tag{15.99}$$

where the first matrix of each product acts on the first index of $\Xi_{pq}$ and
the second matrix acts on the second index.

The Casimir operator in the product representation is

$$(t^a_{r_1 \times r_2})^2 = (t^a_{r_1})^2 \otimes 1 + 2\, t^a_{r_1} \otimes t^a_{r_2} + 1 \otimes (t^a_{r_2})^2.$$

Take the trace; since the matrices $t^a_r$ are traceless, the trace of the second
term on the right is zero. Then

$$\mathrm{tr}\, (t^a_{r_1 \times r_2})^2 = \big( C_2(r_1) + C_2(r_2) \big)\, d(r_1)\, d(r_2). \tag{15.100}$$

On the other hand, the decomposition (15.98) implies

$$\mathrm{tr}\, (t^a_{r_1 \times r_2})^2 = \sum_i C_2(r_i)\, d(r_i). \tag{15.101}$$

Equating (15.100) and (15.101), we find a useful identity for $C_2(r)$.

Now apply this identity to the product of the $N$ and $\bar N$ representations
of $SU(N)$. In this case, the tensor $\Xi_{pq}$ can contain a term proportional to
the invariant $\delta_{pq}$. The remaining $(N^2 - 1)$ independent components of
$\Xi_{pq}$ transform as a general traceless $N \times N$ tensor; the matrices that
effect these transformations make up the adjoint representation of $SU(N)$. In
this case Eq. (15.98) becomes explicitly

$$N \times \bar N = 1 + (N^2 - 1). \tag{15.102}$$

For this decomposition, Eqs. (15.100) and (15.101) imply the identity

$$\Big( 2 \cdot \frac{N^2 - 1}{2N} \Big)\, N^2 = 0 + C_2(G) \cdot (N^2 - 1). \tag{15.103}$$

Thus, for $SU(N)$,

$$C_2(G) = C(G) = N. \tag{15.104}$$
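The arithmetic leading from (15.103) to (15.104) can be reproduced with exact
rational numbers. The sketch below takes $C_2(N)$ from (15.97), equates
(15.100) and (15.101) for the decomposition (15.102), and solves for $C_2(G)$;
the function names are illustrative.

```python
from fractions import Fraction

def c2_fundamental(n):
    """C_2(N) = (N^2 - 1) / (2N), from (15.97); the same value holds for N-bar."""
    return Fraction(n * n - 1, 2 * n)

def c2_adjoint_from_product(n):
    """Solve (15.100) = (15.101) for C_2(G), using N x N-bar = 1 + (N^2 - 1)."""
    product_side = 2 * c2_fundamental(n) * n * n   # (C_2(N) + C_2(N-bar)) d(N) d(N-bar)
    singlet_term = Fraction(0)                     # C_2(1) d(1) = 0 for the singlet
    return (product_side - singlet_term) / (n * n - 1)

for n in range(2, 7):
    print(n, c2_fundamental(n), c2_adjoint_from_product(n))
```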

Some additional examples of the computation of quadratic Casimir operators
are given in Problem 15.5. However, the examples we have discussed in
this section, combined with the basic group-theoretic concepts that we have
reviewed, already provide enough material to carry out the most important
computations of physical interest in non-Abelian gauge theories.


Problems 


15.1 Brute-force computations in SU(3). The standard basis for the fundamental
representation of $SU(3)$ is


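$$\begin{aligned}
t^1 &= \frac{1}{2}\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
t^2 &= \frac{1}{2}\begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, &
t^3 &= \frac{1}{2}\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \\
t^4 &= \frac{1}{2}\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, &
t^5 &= \frac{1}{2}\begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \\
t^6 &= \frac{1}{2}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, &
t^7 &= \frac{1}{2}\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, &
t^8 &= \frac{1}{2\sqrt{3}}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}.
\end{aligned}$$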

(a) Explain why there are exactly eight matrices in the basis.

(b) Evaluate all the commutators of these matrices, to determine the structure
constants of $SU(3)$. Show that, with the normalizations used here, $f^{abc}$ is
totally antisymmetric. (This exercise is tedious; you may wish to check only a
representative sample of the commutators.)

(c) Check the orthogonality condition (15.78), and evaluate the constant $C(r)$
for this representation.

(d) Compute the quadratic Casimir operator $C_2(r)$ directly from its definition
(15.92), and verify the relation (15.94) between $C_2(r)$ and $C(r)$.


15.2 Write down the basis matrices of the adjoint representation of $SU(2)$.
Compute $C(G)$ and $C_2(G)$ directly from their definitions (15.78) and (15.92).

15.3 Coulomb potential.

(a) Using functional integration, compute the expectation value of the Wilson
loop in pure quantum electrodynamics without fermions. Show that

$$\langle U_P(z,z) \rangle = \exp\Big[ -e^2 \oint_P dx^\mu \oint_P dy^\nu\, g_{\mu\nu}\, \frac{1}{8\pi^2 (x-y)^2} \Big],$$

with $x$ and $y$ integrated around the closed curve $P$.

(b) Consider the Wilson loop of a rectangular path of (spacelike) width $R$ and
(timelike) length $T$, $T \gg R$. Compute the expectation value of the Wilson
loop in this limit and compare to the general expression for time evolution,

$$\langle U_P \rangle = \exp[-iE(R)T],$$

where $E(R)$ is the energy of the electromagnetic sources corresponding to the
Wilson loop. Show that the potential energy of these sources is just the Coulomb
potential, $V(R) = -e^2/4\pi R$.

(c) Assuming that the propagator of the non-Abelian gauge field is given by the 
Feynman gauge expression 

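$$\langle A^a_\mu(x)\, A^b_\nu(y) \rangle = \int \frac{d^4p}{(2\pi)^4}\, \frac{-i g_{\mu\nu}\, \delta^{ab}}{p^2}\, e^{-ip\cdot(x-y)},$$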

compute the expectation value of a non-Abelian Wilson loop to order $g^2$. The
result will depend on the representation $r$ of the gauge group in which one
chooses the matrices that appear in the exponential. Show that, to this order,
the Coulomb potential of the non-Abelian gauge theory is
$V(R) = -g^2 C_2(r)/4\pi R$.

15.4 Scalar propagator in a gauge theory. Consider the equation for the Green's
function of the Klein-Gordon equation:

$$(\partial^2 + m^2)\, D_F(x,y) = -i\,\delta^{(4)}(x - y).$$

We can find an interesting representation for this Green's function by writing

$$D_F(x,y) = \int_0^\infty dT\, D(x,y,T),$$

where $D(x,y,T)$ satisfies the Schrödinger equation

$$\Big[ i\frac{\partial}{\partial T} - (\partial^2 + m^2) \Big]\, D(x,y,T) = i\,\delta(T)\,\delta^{(4)}(x - y).$$

Now, represent $D(x,y,T)$ using the functional integral solution of the
Schrödinger equation presented in Section 9.1.

(a) Using the explicit formula of the propagator of the Schrödinger equation,
show that this integral formula gives the standard expression for the Feynman
propagator.

(b) Using the method just described, show that the expression

$$D_F(x,y) = \int_0^\infty dT \int \mathcal{D}x\, \exp\Big[ i \int dt\, \frac{1}{2}\Big( \Big(\frac{dx^\mu}{dt}\Big)^2 - m^2 \Big) - ie \int dt\, \frac{dx^\mu}{dt}\, A_\mu(x) \Big]$$

is a functional integral representation for the scalar field propagator in an
arbitrary background electromagnetic field. Show, in particular, that the
functional integral satisfies the relevant Schrödinger equation. Notice that
this integral depends on $A_\mu$ through the Wilson line.

(c) Generalize this expression to a non-Abelian gauge theory. Show that the
functional integral solves the relevant Schrödinger equation only if the group
matrices in the exponential for the Wilson line are path-ordered.

15.5 Casimir operator computations. An alternative strategy for computing the
quadratic Casimir operator is to compute $C(r)$ in the formula

$$\mathrm{tr}[t^a_r t^b_r] = C(r)\, \delta^{ab}$$

by choosing $t^a$ and $t^b$ to lie in an $SU(2)$ subgroup of the gauge group.

(a) Under an $SU(2)$ subgroup of a general group $G$, an irreducible
representation $r$ of $G$ will decompose into a sum of representations of $SU(2)$:

$$r \to \sum_i j_i,$$

where the $j_i$ are the spins of $SU(2)$ representations. Show that

$$3\,C(r) = \sum_i j_i (j_i + 1)(2 j_i + 1).$$

(b) Under an $SU(2)$ subgroup of $SU(N)$, the fundamental representation $N$
transforms as a 2-component spinor ($j = \frac{1}{2}$) and $(N - 2)$ singlets. Use
this relation to check the formula $C(N) = \frac{1}{2}$. Show that the adjoint
representation of $SU(N)$ decomposes into one spin 1, $2(N - 2)$
spin-$\frac{1}{2}$'s, plus singlets, and use this decomposition to check that
$C(G) = N$.

(c) Symmetric and antisymmetric 2-index tensors form irreducible representations
of $SU(N)$. Compute $C_2(r)$ for each of these representations. The direct sum
of these representations is the product representation $N \times N$. Verify that
your results for $C_2(r)$ satisfy the identity for product representations that
follows from Eqs. (15.100) and (15.101).

Quantization of Non-Abelian Gauge Theories


The previous chapter showed how to construct Lagrangians with non-Abelian 
gauge symmetry. However, this is only the first step in the process of relating 
the idea of non-Abelian gauge invariance to the real interactions of particle 
physics. We must next work out the rules for computing Feynman diagrams 
containing the non-Abelian gauge vector particles, then use these rules to 
compute scattering amplitudes and cross sections. This chapter will develop 
the technology needed for such calculations. 

Alongside this technical discussion, we will study how the gauge symmetry
affects the Feynman amplitudes. In any theory with a local symmetry, some
degrees of freedom of the fields that appear in the Lagrangian are unphysical,
in the sense that they can be adjusted arbitrarily by gauge transformations.
In electrodynamics, the components of the field $A_\mu(k)$ proportional to $k_\mu$ lie
along the symmetry directions. We saw in Section 9.4 that this fact has two
important consequences. First, the propagator of the field $A_\mu$ is ambiguous;
there are multiple expressions for the propagator, which follow equally well
from the QED Lagrangian. Second, the vertices of electrodynamics are such
that this ambiguity makes no difference in the calculation of cross sections. For
example, Eq. (9.58) displays a continuous family of photon propagators, one
for each value of the continuous parameter $\xi$, but we saw immediately that all
dependence of $S$-matrix elements on $\xi$ is eliminated by the Ward identity.
Non-Abelian gauge theories contain similar ambiguities and cancellations, but, as
we will see in this chapter, the structure of the cancellations is more intricate.

An additional goal of this chapter is to compute the Callan-Symanzik $\beta$
function, and hence determine the behavior of the running coupling constant,
for non-Abelian gauge theories. As discussed in Chapter 14, these theories
are in fact asymptotically free: The coupling constant becomes weak at large
momenta. This result indicates the applicability of non-Abelian gauge theory
to model the strong interactions. We will be able to derive this result once we
have determined the correct Feynman rules for non-Abelian gauge theories.

16.1 Interactions of Non-Abelian Gauge Bosons

Most of the Feynman rules for non-Abelian gauge theory can be read directly
from the Yang-Mills Lagrangian, following the method of Section 9.2. However,
when we quantized the electromagnetic field in Section 9.4, we saw that
the functional integral over a gauge field must be defined carefully, and that
the subtle aspects of this construction can introduce new ingredients into the
quantum theory. In this section we will see how far we can go in the
non-Abelian theory by ignoring these subtleties. In Section 16.2 we will carry out
a more proper derivation of the Feynman rules, through a careful analysis of
the functional integral.

Feynman Rules for Fermions and Gauge Bosons 

The Yang-Mills Lagrangian, as derived in the previous chapter, is

$$ \mathcal{L} = -\frac{1}{4}(F^a_{\mu\nu})^2 + \bar\psi(i\not{D} - m)\psi, \tag{16.1} $$

where the index $a$ is summed over the generators of the gauge group $G$, and
the fermion multiplet $\psi$ belongs to an irreducible representation $r$ of $G$. The
field strength is

$$ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu, \tag{16.2} $$

where $f^{abc}$ are the structure constants of $G$. The covariant derivative is defined
in terms of the representation matrices $t^a_r$ by

$$ D_\mu = \partial_\mu - i g A^a_\mu t^a_r. \tag{16.3} $$

From now on we will drop the subscript $r$ except where it is needed for clarity.

The Feynman rules for this Lagrangian can be derived from a functional
integral over the fields $\psi$, $\bar\psi$, and $A^a_\mu$. Imagine expanding the functional integral
in perturbation theory, starting with the free Lagrangian, at $g = 0$. The free
theory consists of a number of free fermions equal to the dimension $d(r)$ of
the representation $r$, and a number of free vector bosons equal to the number
$d(G)$ of generators of $G$. Using the methods of Section 9.5, it is straightforward
to derive the fermion propagator

$$ \langle \psi_{i\alpha}(x)\,\bar\psi_{j\beta}(y) \rangle = \int\!\frac{d^4k}{(2\pi)^4} \left(\frac{i}{\not{k} - m}\right)_{\alpha\beta} \delta_{ij}\, e^{-ik\cdot(x-y)}, \tag{16.4} $$

where $\alpha$, $\beta$ are Dirac indices and $i$, $j$ are indices of the symmetry group:
$i, j = 1, \ldots, d(r)$. In analogy with electrodynamics, we would guess that the
propagator of the vector fields is

$$ \langle A^a_\mu(x) A^b_\nu(y) \rangle = \int\!\frac{d^4k}{(2\pi)^4} \left(\frac{-i g_{\mu\nu}}{k^2}\right) \delta^{ab}\, e^{-ik\cdot(x-y)}, \tag{16.5} $$

with $a, b = 1, \ldots, d(G)$. We will derive this formula in the next section.

Figure 16.1. Feynman rules for fermion and gauge boson vertices of a
non-Abelian gauge theory.

To find the vertices, we write out the nonlinear terms in (16.1). If $\mathcal{L}_0$ is
the free field Lagrangian, then

$$ \mathcal{L} = \mathcal{L}_0 + g A^a_\lambda \bar\psi \gamma^\lambda t^a \psi - g f^{abc} (\partial_\kappa A^a_\lambda) A^{\kappa b} A^{\lambda c} - \frac{1}{4} g^2 (f^{eab} A^a_\kappa A^b_\lambda)(f^{ecd} A^{\kappa c} A^{\lambda d}). \tag{16.6} $$

The first of the three nonlinear terms gives the fermion-gauge boson vertex

$$ i g \gamma^\mu t^a; \tag{16.7} $$

this is a matrix that acts on the Dirac and gauge indices of the fermions. The
second nonlinear term leads to a three gauge boson vertex. To work out this
vertex, we first choose a definite convention for the external momenta and
Lorentz and gauge indices. A suitable convention is shown in Fig. 16.1, with
all momenta pointing inward. Consider first contracting the external gauge
particle with momentum $k$ to the first factor of $A^a$, the gauge particle with
momentum $p$ to the second, and the gauge particle of momentum $q$ to the
third. The derivative contributes a factor $(-ik_\kappa)$ if the momentum points into
the diagram. Then this contribution is

$$ -i g f^{abc} (-i k^\nu) g^{\mu\rho}. \tag{16.8} $$

In all, there are 3! possible contractions, which alternate in sign according to
the total antisymmetry of $f^{abc}$. The sum of these is exhibited in Fig. 16.1.
Finally, the last term of (16.6) leads to a four gauge boson vertex. Following
the conventions of Fig. 16.1, one possible contraction gives the contribution

$$ -i g^2 f^{eab} f^{ecd} g^{\mu\rho} g^{\nu\sigma}. \tag{16.9} $$

There are 4! possible contractions, of which sets of 4 are equal to one another.
The sum of these contributions is shown in Fig. 16.1.

Notice that all of these vertices involve the same coupling constant g. 
We derived the vertices, and thus the equality of the coupling constants, as a 
part of our construction of the Lagrangian from the principle of non-Abelian 
gauge invariance. However, it is also possible to see the need for this equality 
a posteriori , from the properties of Feynman amplitudes. 


Equality of Coupling Constants 

One property that we expect from Feynman amplitudes in non-Abelian gauge 
theories is that they should satisfy Ward identities similar to those of QED. 
These Ward identities express the conservation of the symmetry currents, 
which follows already from the global symmetry of the theory. In QED, the 
simplest form of the Ward identity was obtained by putting external electrons 
and positrons on shell. In non-Abelian gauge theories, the gauge bosons also 
carry charge and so these must also be put on shell to remove contact terms. 
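With all external particles on shell, the amplitude for production of a gauge
boson should obey

$$ k_\mu\, \mathcal{M}^\mu(k) = 0, \tag{16.10} $$

where $\mathcal{M}^\mu(k)$ denotes the amplitude with the polarization vector of the
produced gauge boson, of momentum $k$, removed.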


This identity is not only an indication of the local gauge symmetry, but is 
physically important in its own right. Like the photon, the non-Abelian gauge 
boson has only two physical polarization states. In QED, the on-shell Ward 
identity expressed the fact that the orthogonal, unphysical polarization states 
are not produced in scattering processes. The on-shell Ward identity will play 
a similar role in the non-Abelian case. 

Let us check the Ward identity in a simple case, the lowest-order diagrams
contributing to fermion-antifermion annihilation into a pair of gauge bosons.
In order $g^2$, there are three diagrams, shown in Fig. 16.2. The first two
diagrams are similar to the QED diagrams that we studied in Section 5.5; they
sum to

$$ i\mathcal{M}^{\mu\nu}_{1,2}\, \epsilon^*_\mu(k_1) \epsilon^*_\nu(k_2) = (ig)^2\, \bar v(p_+) \left\{ \gamma^\mu t^a \frac{i}{\not{p} - \not{k}_2 - m} \gamma^\nu t^b + \gamma^\nu t^b \frac{i}{\not{k}_2 - \not{p}_+ - m} \gamma^\mu t^a \right\} u(p)\, \epsilon^*_\mu(k_1) \epsilon^*_\nu(k_2). \tag{16.11} $$

Figure 16.2. Diagrams contributing to fermion-antifermion annihilation to
two gauge bosons.

The vectors $\epsilon(k_i)$ are the gauge boson polarization vectors; for physical
polarizations, these satisfy $k_i^\mu \epsilon_\mu(k_i) = 0$. To check the Ward identity (16.10), we
replace $\epsilon^*_\nu(k_2)$ in (16.11) by $k_{2\nu}$. This gives

$$ i\mathcal{M}^{\mu\nu}_{1,2}\, \epsilon^*_\mu(k_1) k_{2\nu} = (ig)^2\, \bar v(p_+) \left\{ \gamma^\mu t^a \frac{i}{\not{p} - \not{k}_2 - m} \not{k}_2 t^b + \not{k}_2 t^b \frac{i}{\not{k}_2 - \not{p}_+ - m} \gamma^\mu t^a \right\} u(p)\, \epsilon^*_\mu(k_1). \tag{16.12} $$

Since

$$ (\not{p} - m) u(p) = 0 \quad \text{and} \quad \bar v(p_+)(-\not{p}_+ - m) = 0, \tag{16.13} $$

we can add these quantities to $\not{k}_2$ in the first and second terms of (16.12), to
cancel the denominators. This gives

$$ i\mathcal{M}^{\mu\nu}_{1,2}\, \epsilon^*_{1\mu} k_{2\nu} = (ig)^2\, \bar v(p_+) \left\{ -i \gamma^\mu [t^a, t^b] \right\} u(p)\, \epsilon^*_{1\mu}. \tag{16.14} $$

In the Abelian case, this expression would vanish. In the non-Abelian case,
however, the residual term is nonzero and depends on the commutator of
gauge group generators:

$$ i\mathcal{M}^{\mu\nu}_{1,2}\, \epsilon^*_{1\mu} k_{2\nu} = -g^2\, \bar v(p_+) \gamma^\mu u(p)\, \epsilon^*_{1\mu} \cdot f^{abc} t^c. \tag{16.15} $$

We need to find another contribution to cancel this term. Notice, however,
that this term has the group index structure of a fermion-gauge boson vertex
($g \gamma^\mu t^c$) multiplied by a three gauge boson vertex ($g f^{abc}$). This is just the
structure of the third diagram in Fig. 16.2.
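
To make the role of the commutator in (16.14) concrete, the following minimal sketch checks $[t^a, t^b] = i f^{abc} t^c$ numerically in the simplest non-Abelian example. The choice of $SU(2)$ with $t^a = \sigma^a/2$ and $f^{abc} = \epsilon^{abc}$, and all function names, are assumptions made for this illustration; the argument in the text holds for a general gauge group.

```python
# Sketch: the commutator of SU(2) generators in the fundamental representation.
# Assumption for illustration: G = SU(2), t^a = sigma^a / 2, f^{abc} = epsilon^{abc}.

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[0.5 * x for x in row] for row in s] for s in SIGMA]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def commutator_residual():
    """Largest entry of [t^a, t^b] - i f^{abc} t^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            lhs = commutator(T[a], T[b])
            for i in range(2):
                for j in range(2):
                    rhs = sum(1j * epsilon(a, b, c) * T[c][i][j]
                              for c in range(3))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst


if __name__ == "__main__":
    print("max |[t^a,t^b] - i f^{abc} t^c| =", commutator_residual())
    # The commutator itself is nonzero, unlike in the Abelian case.
    print("[t^1, t^2] =", commutator(T[0], T[1]))
```

In an Abelian theory the generators commute and the residual term in (16.14) is absent; here the nonzero commutator is what forces the third diagram to contribute.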

To check that the cancellation works, let us evaluate the third diagram:

$$ i\mathcal{M}^{\mu\nu}_3\, \epsilon^*_\mu \epsilon^*_\nu = ig\, \bar v(p_+) \gamma_\rho t^c u(p)\, \frac{-i}{k_3^2}\, \epsilon^*_\mu(k_1) \epsilon^*_\nu(k_2) \times g f^{abc} \left[ g^{\mu\nu}(k_2 - k_1)^\rho + g^{\nu\rho}(k_3 - k_2)^\mu + g^{\rho\mu}(k_1 - k_3)^\nu \right], $$

with $k_3 = -k_1 - k_2$. If we replace $\epsilon^*_\nu(k_2)$ with $k_{2\nu}$, then eliminate $k_2$ using
momentum conservation, the expression in brackets simplifies as follows:

$$ \begin{aligned} \epsilon^*_\nu(k_2) \left[ g^{\mu\nu}(k_2 - k_1)^\rho + g^{\nu\rho}(k_3 - k_2)^\mu + g^{\rho\mu}(k_1 - k_3)^\nu \right] &\to k_2^\mu (k_2 - k_1)^\rho + k_2^\rho (k_3 - k_2)^\mu + g^{\rho\mu} (k_1 - k_3) \cdot k_2 \\ &= g^{\mu\rho} k_3^2 - k_3^\mu k_3^\rho - g^{\mu\rho} k_1^2 + k_1^\mu k_1^\rho. \end{aligned} \tag{16.16} $$

Figure 16.3. Diagrams contributing to gauge boson-gauge boson scattering.

Let us assume that the other gauge boson, with momentum $k_1$, is on shell
($k_1^2 = 0$), and that it has transverse polarization ($k_1^\mu \epsilon_\mu(k_1) = 0$). Then the
third and fourth terms in the last line vanish. Furthermore, the term $k_3^\mu k_3^\rho$
vanishes when it is contracted with the fermion current. In the remaining
term, the factor $k_3^2$ cancels the gauge boson propagator, and we are left with

$$ i\mathcal{M}^{\mu\nu}_3\, \epsilon^*_{1\mu} k_{2\nu} = +g^2\, \bar v(p_+) \gamma^\mu u(p)\, \epsilon^*_{1\mu} \cdot f^{abc} t^c, \tag{16.17} $$

which precisely cancels (16.15).
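
The index algebra leading to (16.16) is easy to get wrong by hand, so it may help to confirm it numerically. The sketch below contracts the momentum factor of the three-boson vertex with $k_{2\nu}$ for arbitrary momenta and compares with the last line of (16.16). It assumes the metric $\mathrm{diag}(+,-,-,-)$ and $k_3 = -k_1 - k_2$; the function names are illustrative only.

```python
# Sketch: numerical check of the contraction in Eq. (16.16).
# Four-vectors are lists of contravariant components; metric is diag(+,-,-,-).
import random

METRIC = [1.0, -1.0, -1.0, -1.0]


def g(mu, nu):
    """Metric tensor g^{mu nu} (numerically equal to g_{mu nu})."""
    return METRIC[mu] if mu == nu else 0.0


def dot(p, q):
    """Minkowski product p.q."""
    return sum(METRIC[m] * p[m] * q[m] for m in range(4))


def bracket(k1, k2, k3, mu, nu, rho):
    """Momentum factor of the three-boson vertex used for the third diagram."""
    return (g(mu, nu) * (k2[rho] - k1[rho])
            + g(nu, rho) * (k3[mu] - k2[mu])
            + g(rho, mu) * (k1[nu] - k3[nu]))


def contraction_residual(k1, k2):
    """Max deviation between k_{2 nu} * bracket and the last line of (16.16)."""
    k3 = [-(a + b) for a, b in zip(k1, k2)]
    k2_lower = [METRIC[n] * k2[n] for n in range(4)]
    worst = 0.0
    for mu in range(4):
        for rho in range(4):
            lhs = sum(k2_lower[nu] * bracket(k1, k2, k3, mu, nu, rho)
                      for nu in range(4))
            rhs = (g(mu, rho) * dot(k3, k3) - k3[mu] * k3[rho]
                   - g(mu, rho) * dot(k1, k1) + k1[mu] * k1[rho])
            worst = max(worst, abs(lhs - rhs))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed; momenta are arbitrary and off shell
    k1 = [rng.uniform(-1, 1) for _ in range(4)]
    k2 = [rng.uniform(-1, 1) for _ in range(4)]
    print("max residual:", contraction_residual(k1, k2))
```

The identity holds for arbitrary $k_1$ and $k_2$; the on-shell and transversality conditions are needed only afterward, to discard the terms containing $k_1$.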

Notice that this cancellation takes place only if the value of the coupling
constant in the three-boson vertex is identical to that in the fermion-boson
vertex. In a similar way, the Ward identity cannot be satisfied among the
diagrams for boson-boson scattering, shown in Fig. 16.3, unless the coupling
constant $g$ in the four-boson vertex is identical to that in the three-boson
vertex. Thus, the coupling constants of all three nonlinear terms in the Yang-Mills
Lagrangian must be equal in order to preserve the Ward identity and avoid
the production of bosons with unphysical polarization states. Conversely, the
non-Abelian gauge symmetry guarantees that these couplings are equal. The
symmetry thus accomplishes exactly what we hoped it would in our discussion
at the beginning of Chapter 15, giving us a consistent theory of physical
vector particle interactions.

A Flaw in the Argument 

The preceding argument has one serious deficiency. At the final stage, we 
needed to assume that the second gauge boson was transverse. However, one 
might have expected that this information would come out of the argument 
rather than having to be put in. In QED, the Feynman diagrams predict
that, when an electron and a positron annihilate to form two photons, only
the physical transverse polarization states of the photons are produced.
Amplitudes to produce other photon polarizations cancel each other to yield zero,
as we saw in Eq. (5.80). This statement is not true for the non-Abelian gauge
theory Feynman rules that we have worked with so far.

To state the discrepancy more concretely, we introduce some notation.
Let $k^\mu = (k^0, \mathbf{k})$ be a lightlike vector: $k^2 = 0$. Then there are two purely
spatial vectors orthogonal to $\mathbf{k}$. If $k$ is the momentum of a vector boson, these
are the two transverse polarizations. To construct an orthogonal basis, we
must include also the longitudinal polarization state, with polarization vector
parallel to $\mathbf{k}$, and the timelike polarization state. It is most convenient to work
with the two lightlike linear combinations of these states, with polarization
vectors parallel to the vectors $k^\mu$ and $\tilde{k}^\mu = (k^0, -\mathbf{k})$. These two unphysical
polarization states of a massless vector particle can be written as follows:

$$ \epsilon^{+\mu}(k) = \left( \frac{k^0}{\sqrt{2}|\mathbf{k}|}, \frac{\mathbf{k}}{\sqrt{2}|\mathbf{k}|} \right), \qquad \epsilon^{-\mu}(k) = \left( \frac{k^0}{\sqrt{2}|\mathbf{k}|}, -\frac{\mathbf{k}}{\sqrt{2}|\mathbf{k}|} \right). \tag{16.18} $$

We will refer to $\epsilon^+(k)$ and $\epsilon^-(k)$ as the forward and backward lightlike
polarization vectors. Denote the two transverse polarization states $\epsilon^{T\mu}_i(k)$, for
$i = 1, 2$. These four polarization vectors obey the orthogonality relations

$$ \epsilon^T_i \cdot \epsilon^{T*}_j = -\delta_{ij}, \qquad \epsilon^+ \cdot \epsilon^T_i = \epsilon^- \cdot \epsilon^T_i = 0, \qquad (\epsilon^+)^2 = (\epsilon^-)^2 = 0, \qquad \epsilon^+ \cdot \epsilon^- = 1. \tag{16.19} $$

They also satisfy the completeness relation

$$ g_{\mu\nu} = \epsilon^-_\mu \epsilon^{+*}_\nu + \epsilon^+_\mu \epsilon^{-*}_\nu - \sum_i \epsilon^T_{i\mu} \epsilon^{T*}_{i\nu}. \tag{16.20} $$
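
As a quick consistency check of this notation, the following sketch builds the four polarization vectors for a boson moving along the $z$ axis and verifies the relations (16.19) and (16.20) numerically. The choice of direction, the real transverse basis vectors along $x$ and $y$, and the function names are assumptions made for the example.

```python
# Sketch: verify the relations (16.19) and (16.20) for k along the z axis.
# All vectors here are real, so complex conjugation is trivial.
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diag(+,-,-,-)


def dot(a, b):
    """Minkowski product of two vectors given by contravariant components."""
    return sum(METRIC[m] * a[m] * b[m] for m in range(4))


def lower(a):
    """Lower the index of a contravariant vector."""
    return [METRIC[m] * a[m] for m in range(4)]


def polarization_basis(kmag):
    """Return (eps_plus, eps_minus, [eps_T1, eps_T2]) for k = (kmag, 0, 0, kmag)."""
    norm = math.sqrt(2.0) * kmag
    eps_plus = [kmag / norm, 0.0, 0.0, kmag / norm]
    eps_minus = [kmag / norm, 0.0, 0.0, -kmag / norm]
    eps_t = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    return eps_plus, eps_minus, eps_t


def completeness_residual(kmag):
    """Max deviation of the right-hand side of (16.20) from g_{mu nu}."""
    ep, em, et = polarization_basis(kmag)
    ep_l, em_l = lower(ep), lower(em)
    et_l = [lower(e) for e in et]
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            rhs = (em_l[mu] * ep_l[nu] + ep_l[mu] * em_l[nu]
                   - sum(e[mu] * e[nu] for e in et_l))
            target = METRIC[mu] if mu == nu else 0.0
            worst = max(worst, abs(rhs - target))
    return worst


if __name__ == "__main__":
    ep, em, et = polarization_basis(3.0)
    print("eps+ . eps- =", dot(ep, em))
    print("(eps+)^2    =", dot(ep, ep))
    print("epsT . epsT =", dot(et[0], et[0]))
    print("completeness residual:", completeness_residual(3.0))
```

Note that the two lightlike vectors have zero norm individually and unit product with each other, which is why they enter (16.20) only in the cross combinations.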


Using this notation, we can express concretely the gap in the argument
for the Ward identity. The Feynman diagrams of Fig. 16.2 apparently
predict that there is a nonzero amplitude to produce a forward-polarized gauge
boson together with a backward-polarized gauge boson. For this case, we
substitute $\epsilon^{-*}_\mu(k_1)$ and $\epsilon^{+*}_\nu(k_2)$ for the two polarization vectors. Then the term
proportional to $k_1^\mu k_1^\rho$ in Eq. (16.16) no longer vanishes; it now yields

iM = igv(p + )j p t c u(p ) e~*(h) ■ ' 9f abc [~ 

= igv(p + ) 7p t c u(p ) • {~g)f abc K ■ j^. 

Can we simply ignore this totally unphysical process? We are free to
do so in calculations of leading-order amplitudes, but the process will come
back to haunt us in loop diagrams. Recall from Section 7.3 how the optical
theorem (7.49) links the imaginary part of a loop diagram to the square of a
corresponding scattering amplitude, obtained by cutting the diagram across
the loop. If we apply the optical theorem to the diagram shown in Fig. 16.4,
we obtain a paradox. In the gauge boson loop on the left-hand side we can
replace the $g_{\mu\nu}$ factors in the propagators with sums over all four polarization
vectors (16.20). The theorem thus implies that all four polarizations, even
the unphysical ones, should be included for the final-state gauge bosons on
the right-hand side. We are faced with a choice of allowing the production of
unphysical states or violating the optical theorem. A third alternative, equally
unattractive, would be to discard our expression (16.5) for the gauge boson
propagator. Clearly, we are missing some crucial element of the
quantum-mechanical structure of non-Abelian gauge theories.


KK] 

(16.21) 
Figure 16.4. A paradox for the optical theorem in gauge theories.

16.2 The Faddeev-Popov Lagrangian 

It is not surprising that we have found a problem with our Feynman rules for 
non-Abelian gauge theories, since we were not very careful in deriving them. 
In particular, we did not actually derive expression (16.5) for the gauge field 
propagator. In this section we will remedy this by going through a formal 
derivation of this expression. We will find that, although expression (16.5) is 
indeed correct, it is incomplete: It must be supplemented by additional rules 
of a completely new type. 

To define the functional integral for a theory with non-Abelian gauge 
invariance, we will use the Faddeev-Popov method, as introduced in Section 
9.4 to quantize the electromagnetic field. Our present discussion will follow 
Section 9.4 closely. However, as we have by now come to expect, the case of 
non-Abelian local symmetry brings with it new tricks and surprises. 

First consider the quantization of the pure gauge theory, without fermions.
To derive the Feynman rules, we must define the functional integral

$$ \int\!\mathcal{D}A\, \exp\!\left[ i \int d^4x \left( -\frac{1}{4} (F^a_{\mu\nu})^2 \right) \right]. \tag{16.22} $$

As in the Abelian case, the Lagrangian is unchanged along the infinite number 
of directions in the space of field configurations corresponding to local gauge 
transformations. To compute the functional integral we must factor out the 
integrations along these directions, constraining the remaining integral to a 
much smaller space. 

As in electrodynamics, we will constrain the gauge directions by applying
a gauge-fixing condition $G(A) = 0$ at each point $x$. Following Faddeev
and Popov, we can introduce this constraint by inserting into the functional
integral the identity (9.53):

$$ 1 = \int\!\mathcal{D}\alpha(x)\, \delta(G(A^\alpha))\, \det\!\left( \frac{\delta G(A^\alpha)}{\delta\alpha} \right). \tag{16.23} $$

Here $A^\alpha$ is the gauge field $A$ transformed through a finite gauge transformation
as in (15.47):

$$ (A^\alpha)^a_\mu t^a = e^{i\alpha^a t^a} \left[ A^b_\mu t^b + \frac{i}{g} \partial_\mu \right] e^{-i\alpha^c t^c}. \tag{16.24} $$

In evaluating the determinant, the infinitesimal form of this transformation
will be more useful:

$$ (A^\alpha)^a_\mu = A^a_\mu + \frac{1}{g} \partial_\mu \alpha^a + f^{abc} A^b_\mu \alpha^c = A^a_\mu + \frac{1}{g} D_\mu \alpha^a, \tag{16.25} $$

where $D_\mu$ is the covariant derivative (15.86) acting on a field in the adjoint
representation. Note that, as long as the gauge-fixing function $G(A)$ is linear,
the functional derivative $\delta G(A^\alpha)/\delta\alpha$ is independent of $\alpha$.

Since the Lagrangian is gauge invariant, we can replace $A$ by $A^\alpha$ in the
exponential of (16.22). Then, as in the Abelian case, we can interchange the
order of the functional integrals over $A$ and $\alpha$, and then change variables in
the inner integral from $A$ to $A' = A^\alpha$. The transformation (16.24) looks more
complicated than in the Abelian case, but it is nothing more than a linear shift
of the $A^a_\mu$, followed by a unitary rotation of the various components of the
symmetry multiplet $A^a_\mu(x)$ at each point. Both of these operations preserve
the measure

$$ \mathcal{D}A = \prod_x \prod_{a,\mu} dA^a_\mu. \tag{16.26} $$

Thus $\mathcal{D}A = \mathcal{D}A'$, under the integral over $\alpha$. Just as in the Abelian case, the
integral over gauge motions $\alpha$ can be factored out of the functional integral
into an overall normalization, leaving us with

$$ \int\!\mathcal{D}A\, e^{iS[A]} = \left( \int\!\mathcal{D}\alpha \right) \int\!\mathcal{D}A\, e^{iS[A]}\, \delta(G(A))\, \det\!\left( \frac{\delta G(A^\alpha)}{\delta\alpha} \right). \tag{16.27} $$

This normalization factor cancels in the computation of correlation functions
of gauge-invariant operators.

From this point, the derivation of the gauge boson propagator proceeds
as for the photon propagator. We choose the generalized Lorentz gauge
condition

$$ G(A) = \partial^\mu A^a_\mu(x) - \omega^a(x), \tag{16.28} $$

with a Gaussian weight for $\omega^a$ as in Eq. (9.56). The manipulations of Section
9.4 then lead to the class of gauge field propagators

$$ \langle A^a_\mu(x) A^b_\nu(y) \rangle = \int\!\frac{d^4k}{(2\pi)^4}\, \frac{-i}{k^2 + i\epsilon} \left( g_{\mu\nu} - (1 - \xi) \frac{k_\mu k_\nu}{k^2} \right) \delta^{ab}\, e^{-ik\cdot(x-y)}, \tag{16.29} $$

with a freely adjustable gauge parameter $\xi$. Our guess (16.5) corresponds to
the choice $\xi = 1$, called the Feynman-'t Hooft gauge.
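
The tensor structure of (16.29) can be checked by confirming that it inverts the quadratic operator of the gauge-fixed action. The sketch below assumes, following the photon case of Section 9.4, that this operator takes the momentum-space form $-k^2 g_{\mu\nu} + (1 - 1/\xi) k_\mu k_\nu$ for each color component, and verifies that its product with the propagator is $i\delta_\mu{}^\rho$ for several values of $\xi$. The $i\epsilon$ prescription is dropped and the momentum is taken off shell; the names are illustrative.

```python
# Sketch: the xi-dependent propagator inverts the gauge-fixed kinetic operator.
# Assumed operator (one color component): -k^2 g_{mu nu} + (1 - 1/xi) k_mu k_nu.

METRIC = [1.0, -1.0, -1.0, -1.0]  # diag(+,-,-,-)


def g(mu, nu):
    """Metric tensor (same numerical values with upper or lower indices)."""
    return METRIC[mu] if mu == nu else 0.0


def dot(p, q):
    return sum(METRIC[m] * p[m] * q[m] for m in range(4))


def kinetic_operator(k, xi):
    """K_{mu nu} with lower indices, for contravariant momentum k."""
    k2 = dot(k, k)
    kl = [METRIC[m] * k[m] for m in range(4)]
    return [[-k2 * g(mu, nu) + (1.0 - 1.0 / xi) * kl[mu] * kl[nu]
             for nu in range(4)] for mu in range(4)]


def propagator(k, xi):
    """D^{nu rho} = (-i/k^2) (g^{nu rho} - (1 - xi) k^nu k^rho / k^2)."""
    k2 = dot(k, k)
    return [[(-1j / k2) * (g(nu, rho) - (1.0 - xi) * k[nu] * k[rho] / k2)
             for rho in range(4)] for nu in range(4)]


def inverse_residual(k, xi):
    """Max deviation of K_{mu nu} D^{nu rho} from i * delta_mu^rho."""
    K, D = kinetic_operator(k, xi), propagator(k, xi)
    worst = 0.0
    for mu in range(4):
        for rho in range(4):
            total = sum(K[mu][nu] * D[nu][rho] for nu in range(4))
            target = 1j if mu == rho else 0.0
            worst = max(worst, abs(total - target))
    return worst


if __name__ == "__main__":
    k = [2.0, 0.3, -0.5, 0.7]  # arbitrary off-shell momentum, k^2 = 3.17
    for xi in (1.0, 0.5, 3.0):
        print("xi =", xi, "residual =", inverse_residual(k, xi))
```

For $\xi = 1$ the $k_\mu k_\nu$ terms drop out of both factors, which is the sense in which the guess (16.5) is the simplest member of the family.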

So far, this whole derivation parallels the case of electrodynamics. Here,
however, there is one more nontrivial ingredient. In QED, the determinant in
Eq. (16.23) was independent of $A$, so this quantity could be treated as just
another contribution to the normalization factor. In the non-Abelian case this
is no longer true. Using the infinitesimal form (16.25) of the gauge
transformation, we can evaluate

$$ \frac{\delta G(A^\alpha)}{\delta\alpha} = \frac{1}{g} \partial^\mu D_\mu, \tag{16.30} $$

acting on a field in the adjoint representation; this operator depends on $A$.
The functional determinant of (16.30) thus contributes new terms to the
Lagrangian.

Faddeev and Popov chose to represent this determinant as a functional
integral over a new set of anticommuting fields belonging to the adjoint
representation:

$$ \det\!\left( \frac{1}{g} \partial^\mu D_\mu \right) = \int\!\mathcal{D}c\, \mathcal{D}\bar c\, \exp\!\left[ i \int d^4x\, \bar c\, (-\partial^\mu D_\mu)\, c \right]. \tag{16.31} $$

We derived this formal identity in Eq. (9.69), using our rules for fermionic
functional integrals. (The factor of $1/g$ is absorbed into the normalization of
the fields $c$ and $\bar c$.) But to give the correct identity, $c$ and $\bar c$ must be
anticommuting fields that are scalars under Lorentz transformations. The quantum
excitations of these fields have the wrong relation between spin and statistics
to be physical particles. However, we can nevertheless treat these excitations
as additional particles in the computation of Feynman diagrams. These new
fields and their particle excitations are called Faddeev-Popov ghosts.

If we temporarily suppress our curiosity about the physical interpretation
of the ghosts, we can work out their Feynman rules. We write the ghost
Lagrangian more explicitly as

$$ \mathcal{L}_{\text{ghost}} = \bar c^a \left( -\partial^2 \delta^{ac} - g\, \partial^\mu f^{abc} A^b_\mu \right) c^c. \tag{16.32} $$

The first term gives a ghost propagator,

$$ \langle c^a(x)\, \bar c^b(y) \rangle = \int\!\frac{d^4k}{(2\pi)^4}\, \frac{i}{k^2}\, \delta^{ab}\, e^{-ik\cdot(x-y)}. \tag{16.33} $$

In a diagram, this propagator carries an arrow that shows the flow of ghost
number, as in Fig. 16.5. In the interaction term of (16.32), the derivative
stands to the left of the gauge field; this implies that this derivative is
evaluated with the momentum coming out of the vertex along the ghost line. The
explicit Feynman rule is shown in Fig. 16.5. As with the other vertices we
have encountered, the coupling constant $g$ that appears in this vertex must
be equal to the coupling constant $g$ in the three-boson vertex in order to avoid
upsetting the Ward identities.

There are no further subtleties in the construction of the perturbation
theory for non-Abelian gauge theories. In particular, it is straightforward to
include fermions. The final Lagrangian, including all of the effects of
Faddeev-Popov gauge fixing, is

$$ \mathcal{L} = -\frac{1}{4} (F^a_{\mu\nu})^2 - \frac{1}{2\xi} (\partial^\mu A^a_\mu)^2 + \bar\psi (i\not{D} - m) \psi + \bar c^a (-\partial^\mu D^{ac}_\mu) c^c. \tag{16.34} $$

This Lagrangian leads to the propagator (16.29), and to the set of Feynman
rules for vertices shown in Figs. 16.1 and 16.5.

Figure 16.5. Feynman rules for Faddeev-Popov ghosts.

The argument we have just completed suffices to derive the Feynman
diagram expansion of any correlation function of gauge-invariant operators in
a non-Abelian gauge theory. At the end of Section 9.4, we explained that the
Faddeev-Popov gauge-fixing technique also gives the correct gauge-invariant
expressions for $S$-matrix elements. This remains true in the non-Abelian case.
However, the argument given in Section 9.4 relied upon the cancellation in
QED of the emission probabilities for timelike and longitudinal photons, and
we have already found that this cancellation does not go through in the
non-Abelian case. In Section 16.4 we will construct a more sophisticated argument,
in which the Faddeev-Popov ghosts play an essential role, that will correctly
generalize our previous argument to non-Abelian gauge theories.


16.3 Ghosts and Unitarity 

We might now ask whether the new ingredients that we found in the previous
section, the Faddeev-Popov ghosts, can resolve the paradox that we
encountered at the end of Section 16.1. There we saw that the first diagram in
Fig. 16.6 contains a nonzero contribution to its imaginary part that does not
correspond to a possible final state with physical gauge boson polarizations.
We will now compute this contribution more carefully. We must then add a
new potential contribution from the ghosts, shown as the second diagram in
Fig. 16.6.

Let us call the amplitude for fermion-antifermion annihilation into gauge
bosons, which we studied in Section 16.1,

$$ i\mathcal{M}^{\mu\nu}\, \epsilon^*_\mu(k_1) \epsilon^*_\nu(k_2); \tag{16.35} $$

the amplitude for two gauge bosons to convert to a fermion-antifermion pair
will be, correspondingly, $i\mathcal{M}'^{\rho\sigma} \epsilon_\rho(k_1) \epsilon_\sigma(k_2)$. Then, following the Cutkosky rules of
Section 7.3, we find the imaginary part of the first diagram in Fig. 16.6 by
replacing the cut gauge boson propagator with momentum $k_i$ by

$$ -i g_{\mu\nu} \cdot (-2\pi i)\, \delta(k_i^2). \tag{16.36} $$

Replacing both propagators gives two delta functions, turning the
four-dimensional integrals over the gauge boson momenta into three-dimensional
phase space integrals, as in the example in Section 7.3. We are thus left with
the expression

$$ \frac{1}{2}\, (i\mathcal{M}^{\mu\nu})\, g_{\mu\rho}\, g_{\nu\sigma}\, (i\mathcal{M}'^{\rho\sigma}), \tag{16.37} $$

integrated over the phase space of two massless particles. The factor 1/2 is
a symmetry factor for the Feynman diagram or, equivalently, a correction to
the phase space integral for identical particles.

Figure 16.6. The diagram on the left, in which each circle represents the
sum of the three contributions of Fig. 16.2, gives a possible problem for the
optical theorem. The ghost diagram on the right cancels the anomalous terms.

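Now introduce the representation (16.20) for $g_{\mu\rho}$ and $g_{\nu\sigma}$. The pieces that
involve only transverse polarizations correspond to the expected imaginary
parts necessary to satisfy the optical theorem. We need not consider these
terms further. The cross terms between physical and unphysical polarizations
vanish: We showed in Section 16.1 that

$$ i\mathcal{M}^{\mu\nu}\, \epsilon^{T*}_{i\mu}(k_1)\, \epsilon^{+*}_\nu(k_2) = 0. \tag{16.38} $$

The same identity holds if $\mathcal{M}$ is replaced by $\mathcal{M}'$, and if $\epsilon^+$ is replaced by $\epsilon^-$.
Furthermore, the amplitude vanishes if both polarization vectors are forward
or both are backward. The only surviving terms are the cross terms between
forward and backward polarization, which yield the expression

$$ \frac{1}{2} \left[ (i\mathcal{M}^{\mu\nu} \epsilon^{-*}_\mu \epsilon^{+*}_\nu)(i\mathcal{M}'^{\rho\sigma} \epsilon^+_\rho \epsilon^-_\sigma) + (i\mathcal{M}^{\mu\nu} \epsilon^{+*}_\mu \epsilon^{-*}_\nu)(i\mathcal{M}'^{\rho\sigma} \epsilon^-_\rho \epsilon^+_\sigma) \right], \tag{16.39} $$

integrated over phase space. We worked out the value of the first factor in
Eq. (16.21), and the contraction with $\mathcal{M}'$ is very similar. Substituting these
results, expression (16.39) becomes

$$ \frac{1}{2} \left( ig\, \bar v(p_+) \gamma_\mu t^c u(p) \cdot \frac{-i}{(k_1 + k_2)^2} \cdot (-g f^{abc} k_1^\mu) \right) \times \left( ig\, \bar u(p') \gamma_\rho t^d v(p'_+) \cdot \frac{-i}{(k_1 + k_2)^2} \cdot (-g f^{abd} (-k_2)^\rho) \right) + (k_1 \leftrightarrow k_2). \tag{16.40} $$

Using the identity

$$ \bar v(p_+) \gamma_\mu (k_1 + k_2)^\mu u(p) = \bar v(p_+) \gamma_\mu (p + p_+)^\mu u(p) = 0, \tag{16.41} $$

we see that the two terms added in (16.40) are equal.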

Now add the contribution from the Faddeev-Popov ghosts. Using the
Feynman rules in Fig. 16.5, we can assemble the amplitude for
fermion-antifermion annihilation into a pair of ghosts:

$$ i\mathcal{M}_{\text{ghost}} = ig\, \bar v(p_+) \gamma_\mu t^c u(p) \cdot \frac{-i}{(k_1 + k_2)^2} \cdot (-g f^{abc} k_1^\mu). \tag{16.42} $$

This is precisely the first half of expression (16.40). Similarly, the amplitude
for the ghost-antighost pair to annihilate into fermions is equal to the second
half of (16.40). Finally, since Faddeev-Popov ghost fields anticommute, we
must supply a factor of $-1$ for each ghost loop. Thus the ghost contribution
exactly cancels the contribution of unphysical gauge boson polarizations to
the Cutkosky cut of the diagrams in Fig. 16.6.

This example illustrates a general physical interpretation of
Faddeev-Popov ghosts. These “particles” serve as negative degrees of freedom to cancel
the effects of the unphysical timelike and longitudinal polarization states of
the gauge bosons. The simplest effect of the ghosts can already be seen from
the determinants that appear when one integrates over the gauge and ghost
fields in the Faddeev-Popov Lagrangian (16.34). In a general dimension $d$,
working in Feynman gauge and at zero coupling for simplicity, the functional
integral over the gauge and ghost fields in (16.34) yields

$$ (\det[-\partial^2])^{-d/2} \cdot (\det[-\partial^2])^{+1}. \tag{16.43} $$

The second determinant, which appears with a positive exponent because the
ghost fields anticommute, cancels the contribution to the first determinant of
two components of the field $A_\mu$. This physical effect was illustrated, using the
language of Section 9.4, in Problem 9.2.


16.4 BRST Symmetry 

To show how this cancellation extends to the complete interacting theory,
Becchi, Rouet, Stora, and Tyutin introduced as a beautiful formal tool a new
symmetry of the gauge-fixed Lagrangian (16.34), which involves the ghost in
an essential way.* This BRST symmetry has a continuous parameter that is
an anticommuting number. To write the symmetry in its simplest form, let
us rewrite the Faddeev-Popov Lagrangian by introducing a new (commuting)
scalar field $B^a$:

$$ \mathcal{L} = -\frac{1}{4} (F^a_{\mu\nu})^2 + \bar\psi (i\not{D} - m) \psi + \frac{\xi}{2} (B^a)^2 + B^a \partial^\mu A^a_\mu + \bar c^a (-\partial^\mu D^{ac}_\mu) c^c. \tag{16.44} $$

The new field $B^a$ has a quadratic term without derivatives, so it is not a
normal propagating field. The functional integral over $B^a$ can be done by
completing the square in (16.44); this procedure brings us back precisely to
Eq. (16.34). A field of this type, which appears in the functional integral but
has no independent dynamics, is called an auxiliary field.


*C. Becchi, A. Rouet, and R. Stora, Ann. Phys. 98, 287 (1976); I. V. Tyutin,
Lebedev Institute preprint (1975, unpublished); M. Z. Iofa and I. V. Tyutin, Theor.
Math. Phys. 27, 316 (1976).

Now let $\epsilon$ be an infinitesimal anticommuting parameter, and consider the
following infinitesimal transformation of the fields in (16.44):

$$ \begin{aligned} \delta A^a_\mu &= \epsilon D^{ac}_\mu c^c \\ \delta\psi &= ig \epsilon c^a t^a \psi \\ \delta c^a &= -\tfrac{1}{2} g \epsilon f^{abc} c^b c^c \\ \delta \bar c^a &= \epsilon B^a \\ \delta B^a &= 0. \end{aligned} \tag{16.45} $$

The transformation of the fields $A^a_\mu$ and $\psi$ is a local gauge transformation
whose parameter is proportional to the ghost field: $\alpha^a(x) = g \epsilon c^a(x)$. Thus,
the first two terms of (16.44) are invariant to (16.45). The third term is
trivially invariant. The transformation of $A^a_\mu$ in the fourth term cancels the
transformation of $\bar c^a$ in the last term. Finally, we must examine the transformation
of the last ingredient in (16.44):

$$ \begin{aligned} \delta(D^{ac}_\mu c^c) &= D^{ac}_\mu \delta c^c + g f^{abc} \delta A^b_\mu c^c \\ &= -\tfrac{1}{2} g \epsilon\, \partial_\mu (f^{abc} c^b c^c) - \tfrac{1}{2} g^2 \epsilon f^{abc} f^{cde} A^b_\mu c^d c^e \\ &\quad + g \epsilon f^{abc} (\partial_\mu c^b) c^c + g^2 \epsilon f^{abc} f^{bde} A^d_\mu c^e c^c. \end{aligned} \tag{16.46} $$

The two terms of order $g$ manifestly cancel. By using the anticommuting
nature of the ghost fields and exchanging the names of indices, we can write
the remaining two terms as

$$ -\tfrac{1}{2} g^2 \epsilon f^{abc} f^{cde} \left\{ A^b_\mu c^d c^e + A^d_\mu c^e c^b + A^e_\mu c^b c^d \right\}, \tag{16.47} $$

which vanishes by the Jacobi identity (15.70). Apparently, the BRST
transformation (16.45) is a global symmetry of the gauge-fixed Lagrangian (16.44),
for any value of the gauge parameter $\xi$.
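
The vanishing of (16.47) can be made explicit by relabeling indices so that every term multiplies the same field monomial $A^b_\mu c^d c^e$; the coefficient is then $f^{abc} f^{cde} + f^{aec} f^{cbd} + f^{adc} f^{ceb}$, which is one form of the Jacobi identity. The sketch below checks this combination numerically. The use of $SU(2)$ and $SU(3)$, with the standard Gell-Mann values of the $SU(3)$ structure constants, is an assumption made for illustration, as are the function names.

```python
# Sketch: check the Jacobi-identity combination behind (16.47).
# Assumed examples: SU(2) (f = epsilon) and SU(3) (Gell-Mann basis), 0-based indices.
import itertools
import math


def antisymmetric_tensor(n, independent):
    """Build a totally antisymmetric f[a][b][c] from its independent components."""
    f = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for (a, b, c), value in independent.items():
        for perm in itertools.permutations(range(3)):
            idx = [(a, b, c)[p] for p in perm]
            inversions = sum(1 for i in range(3) for j in range(i + 1, 3)
                             if perm[i] > perm[j])
            sign = 1.0 if inversions % 2 == 0 else -1.0
            f[idx[0]][idx[1]][idx[2]] = sign * value
    return f


SU2 = antisymmetric_tensor(3, {(0, 1, 2): 1.0})

R3 = math.sqrt(3.0) / 2.0
SU3 = antisymmetric_tensor(8, {
    (0, 1, 2): 1.0,
    (0, 3, 6): 0.5, (0, 4, 5): -0.5,
    (1, 3, 5): 0.5, (1, 4, 6): 0.5,
    (2, 3, 4): 0.5, (2, 5, 6): -0.5,
    (3, 4, 7): R3, (5, 6, 7): R3,
})


def jacobi_residual(f):
    """Max over (a,b,d,e) of |f^{abc}f^{cde} + f^{aec}f^{cbd} + f^{adc}f^{ceb}|."""
    n = len(f)
    worst = 0.0
    for a, b, d, e in itertools.product(range(n), repeat=4):
        total = sum(f[a][b][c] * f[c][d][e]
                    + f[a][e][c] * f[c][b][d]
                    + f[a][d][c] * f[c][e][b] for c in range(n))
        worst = max(worst, abs(total))
    return worst


if __name__ == "__main__":
    print("SU(2) residual:", jacobi_residual(SU2))
    print("SU(3) residual:", jacobi_residual(SU3))
```

Because the identity depends only on the structure constants, the BRST invariance does not rely on any property of the representation carried by the fermions.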

The BRST transformation has one more remarkable feature, which is
a natural consequence of its anticommuting nature. Let $Q\phi$ be the BRST
transformation of the field $\phi$: $\delta\phi = \epsilon Q\phi$. For example, $QA^a_\mu = D^{ac}_\mu c^c$. Then,
for any field, the BRST variation of $Q\phi$ vanishes:

$$ Q^2 \phi = 0. \tag{16.48} $$

The vanishing of (16.46) proves this identity for the second BRST variation
of the gauge field. For the ghost field,

$$ Q^2 c^a = \tfrac{1}{2} g^2 f^{abc} f^{bde} c^c c^d c^e, \tag{16.49} $$

which vanishes by the Jacobi identity. It is straightforward to check that the
second BRST variations of the other fields in (16.44) also vanish.

To describe the implications of identity (16.48), we now consider studying
the effective theory (16.44) in the Hamiltonian picture after canonical
quantization. Because the Lagrangian has the continuous symmetry (16.45), the
theory will have a conserved current, and the integral of the time component
of this current will be a conserved charge $Q$ that commutes with $H$. The
action of $Q$ on field configurations will be just that described in the previous
paragraph. The relation (16.48) is equivalent to the operator identity

$$ Q^2 = 0. \tag{16.50} $$

We say that the BRST operator $Q$ is nilpotent.

A nilpotent operator that commutes with $H$ divides the eigenstates of
$H$ into three subspaces. Many eigenstates of $H$ must be annihilated by $Q$ so
that (16.50) can be satisfied. Let $\mathcal{H}_1$ be the subspace of states that are not
annihilated by $Q$. Let $\mathcal{H}_2$ be the subspace of states of the form

$$ |\psi_2\rangle = Q |\psi_1\rangle, \tag{16.51} $$

where $|\psi_1\rangle$ is in $\mathcal{H}_1$. According to (16.50), acting with $Q$ again on these states gives
zero. Finally, let $\mathcal{H}_0$ be the subspace of states $|\psi_0\rangle$ that satisfy $Q|\psi_0\rangle = 0$ but
that cannot be written in the form (16.51). The subspace $\mathcal{H}_2$ is quite peculiar,
because any two states in this subspace have zero inner product:

$$ \langle \psi_{2a} | \psi_{2b} \rangle = \langle \psi_{1a} | Q | \psi_{2b} \rangle = 0 \tag{16.52} $$

by (16.50). By the same argument, the states of $\mathcal{H}_2$ have zero inner product
with the states of $\mathcal{H}_0$.
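
A finite-dimensional toy model may make this three-way division easier to picture. The sketch below is not the gauge-theory state space; it is a three-dimensional example constructed for illustration, with an assumed indefinite inner product $\langle u|v\rangle = u^T \eta\, v$. The indefinite metric is essential to the toy: with a positive-definite inner product, the argument of (16.52) would force $Q|\psi\rangle = 0$ for every state.

```python
# Toy model (an assumption for illustration, not the gauge-theory state space):
# a three-dimensional space with indefinite inner product <u|v> = u^T ETA v
# and a nilpotent operator Q that is self-adjoint with respect to ETA.

ETA = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 1]]

Q = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]


def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def inner(u, v):
    """Indefinite inner product defined by ETA."""
    return sum(u[i] * ETA[i][j] * v[j] for i in range(3) for j in range(3))


def is_self_adjoint():
    """Check <u|Q v> = <Q u|v> on all pairs of basis vectors."""
    basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    return all(inner(u, apply(Q, v)) == inner(apply(Q, u), v)
               for u in basis for v in basis)


PSI_1 = [0, 1, 0]        # not annihilated by Q: plays the role of H_1
PSI_2 = apply(Q, PSI_1)  # Q|psi_1>: plays the role of H_2
PSI_0 = [0, 0, 1]        # annihilated by Q, not in the image of Q: H_0

if __name__ == "__main__":
    print("Q^2 psi_1     =", apply(Q, PSI_2))     # nilpotency
    print("<psi_2|psi_2> =", inner(PSI_2, PSI_2))  # zero norm, as in (16.52)
    print("<psi_2|psi_0> =", inner(PSI_2, PSI_0))  # orthogonal to H_0
    print("<psi_0|psi_0> =", inner(PSI_0, PSI_0))  # positive norm
    print("<psi_1|psi_2> =", inner(PSI_1, PSI_2))  # H_2 pairs only with H_1
```

In the toy, the zero-norm state in $\mathcal{H}_2$ has a nonvanishing inner product only with its partner in $\mathcal{H}_1$, while the state in $\mathcal{H}_0$ has ordinary positive norm.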

These considerations seem extremely abstract, but they have a direct
physical correspondence.† To see this, consider single-particle states of the
non-Abelian gauge theory in the limit of zero coupling. According to the
transformation (16.45), $Q$ converts the forward component of $A^a_\mu$ to a
ghost field; equivalently, $Q$ converts a single forward-polarized gauge boson
to a ghost. At $g = 0$, $Q$ annihilates the one-ghost state. At the same time,
$Q$ converts the antighost state to a quantum of $B^a$. To identify this state,
note that the Lagrangian (16.44) implies the classical field equation

$$ \xi B^a = \partial^\mu A^a_\mu. \tag{16.53} $$

Thus the quanta of the field $B^a$ are those quanta of $A^a_\mu$ with
polarization vectors such that $k^\mu \epsilon_\mu(k) \neq 0$; these are the
backward-polarized gauge bosons.

We have now seen that, among the single-particle states of the gauge
theory, forward gauge bosons and antighosts belong to $\mathcal{H}_1$, ghosts
and backward gauge bosons belong to $\mathcal{H}_2$, and transverse gauge bosons
belong to $\mathcal{H}_0$. More generally, it can be shown that asymptotic
states containing ghosts, antighosts, or gauge bosons of unphysical
polarization always belong to $\mathcal{H}_1$ or $\mathcal{H}_2$, while the
asymptotic states in $\mathcal{H}_0$ are those with only transversely polarized
gauge bosons. The BRST operator thus gives a precise relation between the
unphysical gauge boson polarization states and the ghosts and antighosts as
positive and negative degrees of freedom.

†The following argument is presented only at an intuitive level. For a rigorous
discussion, see T. Kugo and I. Ojima, Prog. Theor. Phys. 66, 1 (1979).

In Section 9.4, we argued that the Faddeev-Popov prescription gave the
correct, gauge-invariant result for a certain subclass of $S$-matrix elements,
from which we could compute the physical scattering cross sections of
transversely polarized gauge bosons. These $S$-matrix elements were constructed
by putting operators in the far past to create transversely polarized gauge
bosons, adiabatically turning on the gauge coupling, adiabatically turning off
the gauge coupling, and then placing operators in the far future to annihilate
gauge bosons with transverse polarization. However, this argument had a
possible problem: If the states created as collections of transversely polarized
bosons in the far past could evolve into states that contained gauge bosons of
other polarizations in the far future, the $S$-matrix projected between
transverse gauge boson states would not be unitary. This problem would also lead
to the technical problem discussed in the previous section: The Cutkosky cuts
of diagrams contributing to $S$-matrix elements would have nonzero
contributions from unphysical polarizations. In Section 9.4, we used an argument
special to the Abelian case to show that these problems do not arise in QED.
In the non-Abelian case, the removal of unphysical gauge boson polarizations
is more subtle, and we have seen that it involves the ghosts in an essential way.
To resolve this subtle problem, we apply the principle of BRST symmetry.

Let $|A;\mathrm{tr}\rangle$ be an external state that contains no ghosts or
antighosts and only gauge bosons with transverse polarization. We wish to show
that the $S$-matrix projected onto such states is unitary:

$$ \sum_C \langle A;\mathrm{tr}|\,S^\dagger\,|C;\mathrm{tr}\rangle \langle C;\mathrm{tr}|\,S\,|B;\mathrm{tr}\rangle = \langle A;\mathrm{tr}|\,\mathbf{1}\,|B;\mathrm{tr}\rangle. \tag{16.54} $$

As we explained above, the physical states $|A;\mathrm{tr}\rangle$ belong to,
and in fact span, the subspace $\mathcal{H}_0$ defined by the BRST operator. In
particular, all of these states are annihilated by $Q$. Since $Q$ commutes with
the Hamiltonian, the time evolution of any such state must also produce a state
annihilated by $Q$. Thus,

$$ Q \cdot S\,|A;\mathrm{tr}\rangle = 0. \tag{16.55} $$

This implies that the states $S\,|A;\mathrm{tr}\rangle$ must be linear
combinations of states in $\mathcal{H}_0$ and $\mathcal{H}_2$. However, states
in $\mathcal{H}_2$ have zero inner product with one another and with states in
$\mathcal{H}_0$. Thus the inner product of any two states of the form
$S\,|A;\mathrm{tr}\rangle$ comes only from the overlap of the components in
$\mathcal{H}_0$, so we can write


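$$ \langle A;\mathrm{tr}|\,S^\dagger \cdot S\,|B;\mathrm{tr}\rangle = \sum_C \langle A;\mathrm{tr}|\,S^\dagger\,|C;\mathrm{tr}\rangle \langle C;\mathrm{tr}|\,S\,|B;\mathrm{tr}\rangle. \tag{16.56} $$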





Since the full $S$-matrix is unitary, this relation implies that the restricted
$S$-matrix is also unitary, Eq. (16.54). In addition, (16.56) implies that the
sum of the Cutkosky cuts of diagrams contributing to the $S$-matrix in a given
order is equal to the sum of the cuts involving transverse gauge bosons only.
Thus, the cancellation between diagrams that produce pairs of gauge bosons with
unphysical polarizations and those that produce ghosts is a general property
that persists to all orders in perturbation theory.

Since the BRST transformation generates a continuous symmetry, it generates
a set of Ward identities. These identities are similar in structure to the
Ward identities of the non-Abelian gauge symmetry, since the BRST symmetry
contains a gauge transformation whose parameter is the ghost field.
However, the identities that follow from BRST symmetry are simpler. We
will not study the Ward identities of non-Abelian gauge theory further in
this book. However, when one discusses the renormalization of gauge theories
at a higher level, the central identities among renormalization constants
that follow from the Ward identities are most easily derived using the BRST
symmetry.‡

16.5 One-Loop Divergences of Non-Abelian Gauge Theory

Now that we have discussed the general properties of tree-level diagrams in
non-Abelian gauge theories, we turn our attention to diagrams with loops. As
always in quantum field theory, some of these loop diagrams will diverge, and
we must take care to treat the divergent integrals correctly.

The Lagrangian of a non-Abelian gauge theory (15.39) contains no interactions
of dimension higher than 4. Therefore, by the general arguments of Chapter 10,
this Lagrangian is renormalizable, in the sense that the divergences can be
removed by a finite number of counterterms. However, in non-Abelian gauge
theories, as in QED, the gauge symmetries of the theory imply stronger
restrictions on the structure of the divergences. In QED, provided that we use
a gauge-invariant regulator, there are only four possible divergent
coefficients, which are subtracted by the counterterms for the electromagnetic
vertex ($\delta_1$), for the electron and photon field strength ($\delta_2$ and
$\delta_3$), and for the electron mass ($\delta_m$). In particular, the
possibility of a photon mass renormalization is excluded by gauge invariance.
Furthermore, the two counterterms $\delta_1$ and $\delta_2$ are equal to one
another, and cancel in the evaluation of the electron-photon vertex function,
as a consequence of the Ward identity. Non-Abelian gauge symmetries imply
similar restrictions on the divergences of Feynman diagrams. In this section,
we will illustrate some of these restrictions through examples of one-loop
diagrams.


‡An introduction to the Ward identities of the BRST symmetry is given by Taylor
(1976).

Figure 16.7. Contributions to the gauge boson self-energy in order $g^2$.


The Gauge Boson Self-Energy 

In QED, the strongest constraints of gauge invariance come in the evaluation
of the photon self-energy. The Ward identity implies the relation

$$ q^\mu \bigl(\text{photon self-energy}\bigr)_{\mu\nu} = 0, \tag{16.57} $$

which in turn implies that the photon self-energy diagrams have the structure

$$ \bigl(\text{photon self-energy}\bigr)^{\mu\nu} = i(q^2 g^{\mu\nu} - q^\mu q^\nu)\,\Pi(q^2). \tag{16.58} $$

The only divergence possible is a logarithmically divergent contribution to
$\Pi(q^2)$. In non-Abelian gauge theories, (16.57) still holds, so the
self-energy again has the Lorentz structure (16.58). However, the cancellations
that lead to this structure are more complex. Here we will exhibit these
cancellations by computing the gauge boson self-energy in detail at the
one-loop level. In order to preserve gauge invariance, we will use dimensional
regularization.

The contributions of order $g^2$ to the gauge boson self-energy are shown
in Fig. 16.7. (In addition to these 1PI diagrams, there are three "tadpole"
diagrams; but these automatically vanish, as in QED, by the argument given
below Eq. (10.5).) The fermion loop diagram can be considered separately
from the other diagrams, since in principle we could include any number of
fermions in the theory. We will see below that the contributions of the three
remaining diagrams interlock in an essential way.

Let us first calculate the fermion loop diagram. The Feynman rule for
the vertices in this diagram is identical to the QED Feynman rule, except
for the addition of a group matrix $t^a$ that acts on the fermion gauge group
indices. The value of this diagram is therefore the same as in QED, Eq. (7.90),
multiplied by a trace over group matrices:

$$ (\text{fermion loop}) = \mathrm{tr}[t^a t^b]\; i(q^2 g^{\mu\nu} - q^\mu q^\nu) \left( -\frac{g^2}{(4\pi)^{d/2}} \int_0^1 dx\, 8x(1-x)\, \frac{\Gamma(2-\frac{d}{2})}{(m^2 - x(1-x)q^2)^{2-d/2}} \right). $$

The value of the trace is given by Eq. (15.78): $\mathrm{tr}[t^a t^b] =
C(r)\delta^{ab}$. In a theory with several species of fermions, there would be
a diagram of this type for each species. We will be mainly interested in the
divergent part of this diagram, which is independent of the fermion mass. If
there are $n_f$ species of fermions, all in the same representation $r$, then
the total contribution of fermion loop diagrams takes the form

$$ \sum_{\text{fermions}} (\text{fermion loop}) = i(q^2 g^{\mu\nu} - q^\mu q^\nu)\,\delta^{ab} \left( -\frac{g^2}{(4\pi)^2} \cdot \frac{4}{3}\, n_f\, C(r)\, \Gamma(2-\tfrac{d}{2}) + \cdots \right). \tag{16.59} $$

Now consider the three diagrams from the pure gauge sector. The contribution
of these diagrams depends on the gauge; we will use Feynman-'t Hooft
gauge, $\xi = 1$.

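Using the three-gauge-boson vertex from Fig. 16.1, we can write the first
of the three diagrams as

$$ (\text{gauge boson loop}) = \frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \frac{-i}{p^2}\, \frac{-i}{(p+q)^2}\; g^2 f^{acd} f^{bcd}\, N^{\mu\nu}, \tag{16.60} $$

where the numerator structure is

$$ \begin{aligned} N^{\mu\nu} = {} & \bigl[ g^{\mu\rho}(q-p)^\sigma + g^{\rho\sigma}(2p+q)^\mu + g^{\sigma\mu}(-p-2q)^\rho \bigr] \\ & \times \bigl[ \delta^\nu_\rho (p-q)_\sigma + g_{\rho\sigma}(-2p-q)^\nu + \delta^\nu_\sigma (p+2q)_\rho \bigr]. \end{aligned} $$

The overall factor of 1/2 is a symmetry factor. The contraction of structure
constants can be evaluated using Eq. (15.93): $f^{acd} f^{bcd} = C_2(G)\,\delta^{ab}$.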

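To simplify the expression further, combine denominators in the standard
way:

$$ \frac{1}{p^2 (p+q)^2} = \int_0^1 dx\, \frac{1}{\bigl((1-x)p^2 + x(p+q)^2\bigr)^2} = \int_0^1 dx\, \frac{1}{(P^2 - \Delta)^2}, $$

where $P = p + xq$ and $\Delta = -x(1-x)q^2$. Then (16.60) can be rewritten

$$ (\text{gauge boson loop}) = -\frac{g^2}{2}\, C_2(G)\,\delta^{ab} \int_0^1 dx \int \frac{d^4P}{(2\pi)^4}\, \frac{1}{(P^2-\Delta)^2}\, N^{\mu\nu}. \tag{16.61} $$

The numerator structure can be simplified by eliminating $p$ in favor of $P$,
discarding terms linear in $P^\mu$ (which integrate symmetrically to zero), and
replacing $P^\mu P^\nu$ with $g^{\mu\nu}P^2/d$ (also by symmetry):

$$ \begin{aligned} N^{\mu\nu} &= -g^{\mu\nu}\bigl[(2q+p)^2 + (q-p)^2\bigr] - d\,(q+2p)^\mu (q+2p)^\nu \\ &\quad + \bigl[(2q+p)^\mu (q+2p)^\nu + (q-p)^\mu (2q+p)^\nu - (q+2p)^\mu (q-p)^\nu + (\mu \leftrightarrow \nu)\bigr] \\ &\to -g^{\mu\nu} P^2 \cdot 6\bigl(1 - \tfrac{1}{d}\bigr) - g^{\mu\nu} q^2 \bigl[(2-x)^2 + (1+x)^2\bigr] \\ &\quad + q^\mu q^\nu \bigl[(2-d)(1-2x)^2 + 2(1+x)(2-x)\bigr]. \end{aligned} $$

The final step in the evaluation is to Wick-rotate and apply the integration
formulae (7.85) and (7.86). This brings the diagram into the following form:

$$ \begin{aligned} (\text{gauge boson loop}) = \frac{i g^2}{(4\pi)^{d/2}}\, C_2(G)\,\delta^{ab} \int_0^1 dx\, \frac{1}{\Delta^{2-d/2}} \Bigl( & \Gamma(1-\tfrac{d}{2})\, g^{\mu\nu} q^2 \bigl[\tfrac{3}{2}(d-1)x(1-x)\bigr] \\ & + \Gamma(2-\tfrac{d}{2})\, g^{\mu\nu} q^2 \bigl[\tfrac{1}{2}(2-x)^2 + \tfrac{1}{2}(1+x)^2\bigr] \\ & - \Gamma(2-\tfrac{d}{2})\, q^\mu q^\nu \bigl[(1-\tfrac{d}{2})(1-2x)^2 + (1+x)(2-x)\bigr] \Bigr). \end{aligned} \tag{16.62} $$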

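Next consider the diagram with a four-gauge-boson vertex. Using the
vertex Feynman rule in Fig. 16.1, we find

$$ \begin{aligned} (\text{four-vertex loop}) = \frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \frac{-i g_{\rho\sigma}}{p^2}\, \delta^{cd}\, (-ig^2) \bigl[\, & f^{abe} f^{cde} (g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho}) \\ & + f^{ace} f^{bde} (g^{\mu\nu} g^{\rho\sigma} - g^{\mu\sigma} g^{\nu\rho}) \\ & + f^{ade} f^{bce} (g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma}) \bigr]. \end{aligned} \tag{16.63} $$

The factor 1/2 in the first line is a symmetry factor. The first combination of
structure constants in the vertex factor vanishes by antisymmetry; the second
and third can be reduced by the use of Eq. (15.93). We then find simply

$$ (\text{four-vertex loop}) = -g^2 C_2(G)\,\delta^{ab} \int \frac{d^dp}{(2\pi)^d}\, \frac{1}{p^2}\, g^{\mu\nu}(d-1). \tag{16.64} $$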


In dimensional regularization, the integral over $p$ gives a pole at $d = 2$ but
yields zero as $d \to 4$. We could simply discard this diagram and trust that the
pole at $d = 2$ is canceled by the other two diagrams. It is instructive, however,
and no more difficult, to demonstrate the cancellation explicitly. To do so, we
can force the integral to look like that of the previous diagram, multiplying the
integrand by 1 in the form $(q+p)^2/(q+p)^2$. We then combine denominators
as before, and eliminate $p$ in favor of the shifted variable $P = p + xq$. After
dropping the term linear in $P$, we obtain

$$ (\text{four-vertex loop}) = -g^2 C_2(G)\,\delta^{ab} \int_0^1 dx \int \frac{d^dP}{(2\pi)^d}\, \frac{1}{(P^2-\Delta)^2}\, g^{\mu\nu}(d-1)\bigl[P^2 + (1-x)^2 q^2\bigr]. $$

We can now Wick-rotate and integrate over $P$ to obtain

$$ \begin{aligned} (\text{four-vertex loop}) = \frac{i g^2}{(4\pi)^{d/2}}\, C_2(G)\,\delta^{ab} \int_0^1 dx\, \frac{1}{\Delta^{2-d/2}} \Bigl( & -\Gamma(1-\tfrac{d}{2})\, g^{\mu\nu} q^2 \bigl[\tfrac{1}{2} d(d-1)x(1-x)\bigr] \\ & - \Gamma(2-\tfrac{d}{2})\, g^{\mu\nu} q^2 \bigl[(d-1)(1-x)^2\bigr] \Bigr). \end{aligned} \tag{16.65} $$

Expressions (16.62) and (16.65), by themselves, do not add to any reasonable
value: The pole at $d = 2$ does not cancel, and the sum does not have
a transverse Lorentz structure. To bring the gauge boson self-energy into its
desired form, we must include the diagram with a ghost loop. According to
the rules shown in Fig. 16.5, this diagram is

$$ (\text{ghost loop}) = (-1) \int \frac{d^4p}{(2\pi)^4}\, \frac{i}{p^2}\, \frac{i}{(p+q)^2}\; g^2 f^{dac} (p+q)^\mu f^{cbd} p^\nu. \tag{16.66} $$

There is no symmetry factor in this case, but there is a factor of $-1$ because
the ghost fields anticommute. The ghost diagram can be simplified using the
same set of tricks that we applied to the previous two: combine denominators,
shift the integral to $P$, Wick-rotate, and integrate over $P$ using dimensional
regularization. The result is

$$ \begin{aligned} (\text{ghost loop}) = \frac{i g^2}{(4\pi)^{d/2}}\, C_2(G)\,\delta^{ab} \int_0^1 dx\, \frac{1}{\Delta^{2-d/2}} \Bigl( & -\Gamma(1-\tfrac{d}{2})\, g^{\mu\nu} q^2 \bigl[\tfrac{1}{2} x(1-x)\bigr] \\ & + \Gamma(2-\tfrac{d}{2})\, q^\mu q^\nu \bigl[x(1-x)\bigr] \Bigr). \end{aligned} \tag{16.67} $$

Now we are ready to put these results together. In the sum of the three
diagrams, the coefficient of $\Gamma(1-\frac{d}{2})\, g^{\mu\nu} q^2\, x(1-x)$ is

$$ \tfrac{1}{2}(3d - 3 - d^2 + d - 1) = (1 - \tfrac{d}{2})(d - 2). \tag{16.68} $$

The first factor cancels the pole of the gamma function at $d = 2$. Thus, the
sum of the three diagrams has no quadratic divergence and no gauge boson
mass renormalization. Notice that the ghost diagram plays an essential role
in this cancellation.

After the pole at $d = 2$ is canceled, $\Gamma(1-\frac{d}{2})$ becomes
$\Gamma(2-\frac{d}{2})$. This term therefore combines with the others that are
proportional to $\Gamma(2-\frac{d}{2})\, g^{\mu\nu} q^2$, to give a total
coefficient of

$$ (d-2)x(1-x) + \tfrac{1}{2}(2-x)^2 + \tfrac{1}{2}(1+x)^2 - (d-1)(1-x)^2. \tag{16.69} $$

Since the best way to simplify this expression is not obvious, let us put it
aside and work first with the coefficient of $\Gamma(2-\frac{d}{2})\, q^\mu q^\nu$:

$$ -(1-\tfrac{d}{2})(1-2x)^2 - (1+x)(2-x) + x(1-x) = -(1-\tfrac{d}{2})(1-2x)^2 - 2. $$

If the total self-energy is to be proportional to $(g^{\mu\nu} q^2 - q^\mu q^\nu)$,
it must be possible to reduce expression (16.69) to this same form (times $-1$).
To do so, note that $\Delta$ is symmetric with respect to $x \leftrightarrow (1-x)$,
and therefore we can substitute $(1-x)$ for $x$ in any term of the numerator.
In particular, terms that are linear in $x$ can be transformed as follows:

$$ x \to \tfrac{1}{2}x + \tfrac{1}{2}(1-x) = \tfrac{1}{2}. $$

In the end, the sum of the three pure-gauge diagrams simplifies to

$$ (\text{sum of pure-gauge diagrams}) = \frac{i g^2}{(4\pi)^{d/2}}\, C_2(G)\,\delta^{ab} \int_0^1 dx\, \frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}\, (g^{\mu\nu} q^2 - q^\mu q^\nu) \bigl[(1-\tfrac{d}{2})(1-2x)^2 + 2\bigr]. \tag{16.70} $$
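
The bookkeeping behind (16.68) through (16.70) is easy to get wrong by hand, so it may help to check it with exact rational arithmetic. The following is a minimal sketch, not part of the original derivation; the function names are hypothetical. It checks the pole coefficient (16.68), confirms that (16.69) reduces to the bracket of (16.70) once the integrand is symmetrized under $x \leftrightarrow 1-x$, and evaluates the $x$ integral of that bracket at $d = 4$, which gives the coefficient $5/3$ of the ultraviolet-divergent part.

```python
from fractions import Fraction as F


def pole_coefficient(d):
    """Sum of the Gamma(1 - d/2) g^{mu nu} q^2 x(1-x) coefficients taken
    from (16.62), (16.65), and (16.67)."""
    gauge_loop = F(3, 2) * (d - 1)
    four_vertex = -F(1, 2) * d * (d - 1)
    ghost_loop = -F(1, 2)
    return gauge_loop + four_vertex + ghost_loop


def expr_16_69(d, x):
    """Coefficient of Gamma(2 - d/2) g^{mu nu} q^2, Eq. (16.69)."""
    return ((d - 2) * x * (1 - x) + F(1, 2) * (2 - x) ** 2
            + F(1, 2) * (1 + x) ** 2 - (d - 1) * (1 - x) ** 2)


def transverse_bracket(d, x):
    """Bracket of Eq. (16.70): (1 - d/2)(1 - 2x)^2 + 2."""
    return (1 - F(d) / 2) * (1 - 2 * x) ** 2 + 2


def symmetrized(d, x):
    """Average of (16.69) over x <-> 1 - x, allowed because Delta is symmetric."""
    return (expr_16_69(d, x) + expr_16_69(d, 1 - x)) / 2


def simpson_unit_interval(f):
    """Simpson's rule on [0, 1]; exact for polynomials up to degree 3."""
    return (f(F(0)) + 4 * f(F(1, 2)) + f(F(1))) / 6


if __name__ == "__main__":
    for d in (F(2), F(3), F(4), F(7, 2)):
        assert pole_coefficient(d) == (1 - d / 2) * (d - 2)
        for x in (F(0), F(1, 3), F(1, 2), F(4, 5)):
            assert symmetrized(d, x) == transverse_bracket(d, x)
    print(simpson_unit_interval(lambda x: transverse_bracket(4, x)))
```

Because the bracket is quadratic in $x$, Simpson's rule integrates it exactly, so no numerical tolerance is needed.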


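This expression is manifestly transverse, as required by the Ward identity of
the non-Abelian gauge theory. For future reference, we record the ultraviolet
divergent part of (16.70):

$$ (\text{sum of pure-gauge diagrams}) = i(q^2 g^{\mu\nu} - q^\mu q^\nu)\,\delta^{ab} \left( -\frac{g^2}{(4\pi)^2} \cdot \Bigl(-\frac{5}{3}\Bigr) C_2(G)\, \Gamma(2-\tfrac{d}{2}) + \cdots \right). \tag{16.71} $$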


As we noted above, the result (16.70) depends on the gauge used in the
calculation. In any gauge, the boson self-energy is transverse and free of
quadratic divergences. However, the coefficient of the transverse Lorentz
structure may depend on $\xi$. It turns out that, for a general value of $\xi$,
the coefficient of the ultraviolet divergence in (16.71) is modified according to

$$ -\frac{5}{3} \;\to\; -\Bigl(\frac{13}{6} - \frac{\xi}{2}\Bigr). \tag{16.72} $$

The fact that the boson self-energy depends on the gauge does not contradict
the general theorem that $S$-matrix elements are independent of $\xi$. The full
set of one-loop corrections to a gauge theory $S$-matrix element always involves
a number of different radiative corrections to vertices and propagators; the
gauge dependence cancels in an intricate fashion among these various terms.

The $\beta$ Function

The simplest calculation that involves a gauge-invariant combination of
radiative corrections is the computation of the leading term of the
Callan-Symanzik $\beta$ function of a non-Abelian gauge theory. The invariance
of the leading term of $\beta$ could be argued intuitively, by saying that the
coupling constant of the gauge theory should not evolve to large values in one
scheme of calculation while it stays small in another scheme. In Section 17.2
we will demonstrate this result more cleanly by showing that the leading
coefficient of the $\beta$ function can be extracted from a physical cross
section and so must be gauge independent. (Surprisingly, this conclusion
actually applies to the first two coefficients of the $\beta$ function, written
as a power series in $g$.)

Recall from Section 12.2 that the $\beta$ function gives the rate at which the
renormalized coupling constant changes as the renormalization scale $M$ is
increased. Since Green's functions depend on $M$ through the counterterms
that subtract ultraviolet divergences, $\beta$ can be computed from the
counterterms that enter an appropriately chosen Green's function. For example,
in Eq. (12.58), we saw that the $\beta$ function of QED can be computed from the
counterterms for the electron-photon vertex, the electron self-energy, and the
photon self-energy. The same derivation goes through in the case of a
non-Abelian gauge theory. Thus, to lowest order,

$$ \beta(g) = g M \frac{\partial}{\partial M}\Bigl(-\delta_1 + \delta_2 + \tfrac{1}{2}\delta_3\Bigr), \tag{16.73} $$

with the conventions for the counterterm vertices shown in Fig. 16.8. In QED,
the first two terms cancel by the Ward identity, so $\beta$ depends only on
$\delta_3$. In the non-Abelian case, all three terms contribute. The most
difficult to compute is $\delta_3$, but we have nearly done so already by
computing the gauge-boson self-energy diagrams. Let us now complete this
calculation of the $\beta$ function of non-Abelian gauge theory.

In order for the counterterm $\delta_3$ to cancel the divergence of Eqs. (16.59)
and (16.71), it must be of the form

$$ \delta_3 = \frac{g^2}{(4\pi)^2}\, \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} \Bigl[ \frac{5}{3}\, C_2(G) - \frac{4}{3}\, n_f\, C(r) \Bigr], \tag{16.74} $$

where $M$ is the renormalization scale. Depending on the precise
renormalization conditions used, there may be additional finite contributions
to $\delta_3$, but these do not contribute to the $\beta$ function (to one-loop
order). Similarly, the finite parts of $\delta_2$ and $\delta_1$ will depend on
the details of the renormalization scheme. However, as we saw in Section 12.2,
the one-loop contribution to the $\beta$ function is the same in any scheme in
which amplitudes are renormalized at a point where all momentum invariants are
of the same order $M^2$. In dimensional regularization, a logarithmic divergence
always takes the form $\Gamma(2-\frac{d}{2})/\Delta^{2-d/2}$, where $\Delta$ is
some combination of momentum invariants. Thus, to compute the $\beta$ function,
we can simply set $\Delta = M^2$ in such expressions.





Figure 16.8. Counterterms needed for computing fermion interactions in a 
non-Abelian gauge theory. 


Figure 16.9. Diagrams whose divergences are subtracted by the counterterms
$\delta_2$ and $\delta_1$.

To complete the computation of the $\beta$ function, we must compute $\delta_2$
and $\delta_1$ to the same level of approximation. The fermion self-energy
counterterm $\delta_2$ cancels the divergence proportional to $\not{k}$ in the
first diagram of Fig. 16.9. In Feynman-'t Hooft gauge, the value of this
diagram is

$$ (\text{fermion self-energy}) = \int \frac{d^4p}{(2\pi)^4}\, (ig)^2\, \gamma^\mu t^a\, \frac{i(\not{p} + \not{k})}{(p+k)^2}\, \gamma_\mu t^a\, \frac{-i}{p^2}. \tag{16.75} $$

Since the divergence in the field strength renormalization is independent of
the fermion mass, we have simplified (16.75) by setting the mass to zero. The
product of group matrices equals the quadratic Casimir operator, by
definition (15.92). The Dirac matrix structure can be reduced using a
contraction identity (7.89). The rest of the calculation follows the same steps
as for the boson self-energy diagrams:

$$ \begin{aligned} (\text{fermion self-energy}) &= g^2 C_2(r)(d-2) \int \frac{d^4p}{(2\pi)^4}\, \frac{\not{p} + \not{k}}{(p+k)^2\, p^2} \\ &= g^2 C_2(r)(d-2) \int_0^1 dx \int \frac{d^4P}{(2\pi)^4}\, \frac{(1-x)\not{k}}{(P^2 - \Delta)^2} \\ &= \frac{i g^2}{(4\pi)^{d/2}}\, C_2(r)\, \not{k} \int_0^1 dx\, (1-x)(d-2)\, \frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}} \\ &= \frac{i g^2}{(4\pi)^2}\, \not{k}\, C_2(r)\, \Gamma(2-\tfrac{d}{2}) + \cdots. \end{aligned} \tag{16.76} $$

(Here $P = p + xk$ and $\Delta = -x(1-x)k^2$.)

The divergent part of this expression must be canceled by the second
counterterm diagram of Fig. 16.8. Thus, if the renormalization scale is $M$, the
counterterm must be

$$ \delta_2 = -\frac{g^2}{(4\pi)^2}\, \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} \cdot C_2(r), \tag{16.77} $$

plus finite terms. We note that, like $\delta_3$, $\delta_2$ depends on the
gauge; for example, $\delta_2$ has no one-loop divergence in Landau gauge
($\xi = 0$).

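To determine $\delta_1$, we must compute the second and third diagrams of
Fig. 16.9. The second diagram, computed in Feynman-'t Hooft gauge and for
massless fermions, is

$$ (\text{second diagram}) = \int \frac{d^4p}{(2\pi)^4}\, g^3\, t^b t^a t^b\, \frac{\gamma^\nu (\not{p} + \not{k}')\, \gamma^\mu\, (\not{p} + \not{k})\, \gamma_\nu}{(p+k')^2\, (p+k)^2\, p^2}. \tag{16.78} $$

The gauge group matrices can be simplified according to

$$ \begin{aligned} t^b t^a t^b &= t^b t^b t^a + t^b [t^a, t^b] \\ &= C_2(r)\, t^a + t^b\, i f^{abc}\, t^c \\ &= C_2(r)\, t^a + \tfrac{1}{2}\, i f^{abc} \cdot i f^{bcd}\, t^d \\ &= \bigl[C_2(r) - \tfrac{1}{2} C_2(G)\bigr]\, t^a. \end{aligned} \tag{16.79} $$

In the third line we have used the antisymmetry of $f^{abc}$ to rewrite the matrix
product as a commutator; in the last line we have used Eq. (15.93).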
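
Identity (16.79) can be spot-checked numerically in the simplest case. The sketch below assumes the SU(2) doublet representation with $t^a = \sigma^a/2$, for which $C_2(r) = 3/4$ and $C_2(G) = 2$, so the predicted coefficient is $3/4 - 1 = -1/4$. These values and all helper names are assumptions made for the illustration; the check covers only this one group and representation.

```python
# SU(2) doublet generators t^a = sigma^a / 2 (Pauli matrices).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in SIGMA]

C2_R = 0.75  # assumed value: C_2(r) for the SU(2) doublet
C2_G = 2.0   # assumed value: C_2(G) for SU(2)


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree entrywise within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


def sandwich(a):
    """Sum over b of t^b t^a t^b, the left-hand side of (16.79)."""
    total = [[0, 0], [0, 0]]
    for tb in T:
        total = add(total, matmul(matmul(tb, T[a]), tb))
    return total


def casimir():
    """Sum over b of t^b t^b, expected to equal C_2(r) times the identity."""
    total = [[0, 0], [0, 0]]
    for tb in T:
        total = add(total, matmul(tb, tb))
    return total


if __name__ == "__main__":
    print(close(casimir(), [[C2_R, 0], [0, C2_R]]))
    print(all(close(sandwich(a), scale(C2_R - C2_G / 2, T[a]))
              for a in range(3)))
```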

The diagrams computed earlier in this section had positive superficial
degrees of divergence, so we needed to extract their logarithmic divergences
carefully. The integral in (16.78), however, is superficially logarithmically
divergent, and so the coefficient of this divergence can be extracted easily by
considering the limit in which the integration variable $p$ is much greater than
any external momentum. In this limit, the diagram is estimated as follows:

$$ (\text{second diagram}) \sim g^3 \bigl[C_2(r) - \tfrac{1}{2} C_2(G)\bigr]\, t^a \int \frac{d^4p}{(2\pi)^4}\, \frac{\gamma^\nu \not{p}\, \gamma^\mu \not{p}\, \gamma_\nu}{p^2 \cdot p^2 \cdot p^2}. \tag{16.80} $$

If we replace $p^\rho p^\sigma$ by $g^{\rho\sigma} p^2/d$ in the numerator of
(16.80), this expression simplifies easily:

$$ \begin{aligned} (\text{second diagram}) &\sim g^3 \bigl[C_2(r) - \tfrac{1}{2} C_2(G)\bigr]\, t^a\, \frac{1}{d}(2-d)^2\, \gamma^\mu \int \frac{d^4p}{(2\pi)^4}\, \frac{1}{(p^2)^2} \\ &\sim \frac{i g^3}{(4\pi)^2} \bigl[C_2(r) - \tfrac{1}{2} C_2(G)\bigr]\, t^a \gamma^\mu \bigl(\Gamma(2-\tfrac{d}{2}) + \cdots\bigr). \end{aligned} \tag{16.81} $$

This estimate gives the correct coefficient of the divergent term. It drops
completely the finite terms in the vertex function, but we do not need these
to compute the $\beta$ function.

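The third diagram of Fig. 16.9 can be analyzed in the same way. Its value,
in Feynman-'t Hooft gauge and for massless fermions, is

$$ \begin{aligned} (\text{third diagram}) = \int \frac{d^4p}{(2\pi)^4}\, & (ig\gamma_\nu t^b)\, \frac{i\not{p}}{p^2}\, (ig\gamma_\rho t^c)\, \frac{-i}{(k'-p)^2}\, \frac{-i}{(k-p)^2} \\ & \times g f^{abc} \bigl[ g^{\mu\nu}(2k'-k-p)^\rho + g^{\nu\rho}(-k'-k+2p)^\mu + g^{\rho\mu}(2k-k'-p)^\nu \bigr]. \end{aligned} \tag{16.82} $$

The gauge matrix product can be reduced as follows:

$$ f^{abc} t^b t^c = \tfrac{1}{2} f^{abc} \cdot i f^{bcd} t^d = \tfrac{i}{2}\, C_2(G)\, t^a. $$

Again we can determine the logarithmic divergence of this diagram by neglecting
all external momenta in comparison with $p$. A straightforward calculation
then yields

$$ \begin{aligned} (\text{third diagram}) &\sim \frac{g^3}{2}\, C_2(G)\, t^a \int \frac{d^4p}{(2\pi)^4}\, \frac{\gamma_\nu \not{p}\, \gamma_\rho}{(p^2)^3} \bigl( g^{\mu\nu} p^\rho - 2 g^{\nu\rho} p^\mu + g^{\rho\mu} p^\nu \bigr) \\ &\sim \frac{g^3}{2}\, C_2(G)\, t^a \int \frac{d^4p}{(2\pi)^4}\, \frac{1}{(p^2)^2} \Bigl[ \gamma^\mu - \frac{2}{d}(2-d)\gamma^\mu + \gamma^\mu \Bigr] \\ &\sim \frac{i g^3}{(4\pi)^2}\, \frac{3}{2}\, C_2(G)\, t^a \gamma^\mu \bigl(\Gamma(2-\tfrac{d}{2}) + \cdots\bigr). \end{aligned} \tag{16.83} $$

In the second line we have again replaced $p^\rho p^\sigma$ with
$g^{\rho\sigma} p^2/d$.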

The sum of the divergences in results (16.81) and (16.83) must be canceled
by the third counterterm diagram in Fig. 16.8. With a renormalization scale
of $M$, we find

$$ \delta_1 = -\frac{g^2}{(4\pi)^2}\, \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} \bigl[C_2(r) + C_2(G)\bigr]. \tag{16.84} $$


Notice that $\delta_1$ is not equal to $\delta_2$, as would have been true in
the Abelian case; here $\delta_1$ has an extra term, proportional to $C_2(G)$.

We are now ready to compute the $\beta$ function. Plugging the three
counterterms (16.74), (16.77), and (16.84) into our formula (16.73), we find

$$ \beta(g) = \frac{g^3}{(4\pi)^2}\,(-2) \Bigl[ \bigl(C_2(r) + C_2(G)\bigr) - C_2(r) + \frac{1}{2}\Bigl(\frac{5}{3}\, C_2(G) - \frac{4}{3}\, n_f\, C(r)\Bigr) \Bigr]; $$

that is,

$$ \beta(g) = -\frac{g^3}{(4\pi)^2} \Bigl[ \frac{11}{3}\, C_2(G) - \frac{4}{3}\, n_f\, C(r) \Bigr]. \tag{16.85} $$

Notice that, at least for small values of $n_f$, the $\beta$ function is
negative and so non-Abelian gauge theories are asymptotically free. This is a
result of exceptional physical importance, first discovered by 't Hooft,
Politzer, and Gross and Wilczek.* We will discuss the physical interpretation
of this result further in Section 16.7, and in the next several chapters.
However, for the rest of this section, we will resist the temptation to pursue
the physics and instead complete our technical analysis of the divergences of
non-Abelian gauge theories.
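
To get a feel for the phrase "small values of $n_f$", one can evaluate the bracket of (16.85) for a definite group. The sketch below assumes SU(N) with fermions in the fundamental representation, for which the standard normalization gives $C_2(G) = N$ and $C(r) = 1/2$; these group-theory inputs and the function names are assumptions of this illustration rather than results derived in this section.

```python
from fractions import Fraction as F


def beta_bracket(c2_adjoint, c_rep, n_f):
    """Bracket of Eq. (16.85): (11/3) C_2(G) - (4/3) n_f C(r).

    Since beta(g) = -g^3 / (4 pi)^2 times this bracket, a positive value
    means a negative beta function, i.e. asymptotic freedom.
    """
    return F(11, 3) * c2_adjoint - F(4, 3) * n_f * c_rep


def su_n_fundamental(n_colors, n_f):
    """Bracket for SU(N) with n_f fundamental fermions, assuming
    C_2(G) = N and C(r) = 1/2."""
    return beta_bracket(F(n_colors), F(1, 2), n_f)


def max_flavors(n_colors):
    """Largest n_f for which the one-loop beta function is still negative."""
    n_f = 0
    while su_n_fundamental(n_colors, n_f + 1) > 0:
        n_f += 1
    return n_f


if __name__ == "__main__":
    print(su_n_fundamental(3, 6))   # bracket for SU(3) with six flavors
    print(max_flavors(3))           # largest asymptotically free n_f for SU(3)
```

Under these assumptions the bracket for SU(3) is $11 - \tfrac{2}{3} n_f$, which stays positive up to sixteen fundamental flavors.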

Relations among Counterterms 

In the analysis just completed, we computed the $\beta$ function of a non-Abelian
gauge theory from the divergences of the fermion vertex and field strength
renormalizations. One might visualize that we were computing the running
of the coupling constant at the fermion-gauge boson vertex. Alternatively,
we could have studied the divergences of the three-gauge-boson vertex or the
four-gauge-boson vertex, and thus computed the running of these coupling
constants. However, we saw already in Section 16.1 that non-Abelian gauge
invariance knits together these separate coupling constants and requires their
equality. Thus we might expect that these different calculations should
produce the same value of the $\beta$ function.

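To clarify this issue, let us carefully enumerate all the counterterms that
appear in a non-Abelian gauge theory. We start from the Lagrangian (16.34),
regarded as a combination of bare fields and a bare coupling constant. In the
following discussion, we denote bare quantities by the subscript 0. Then,

$$ \begin{aligned} \mathcal{L} = {} & -\tfrac{1}{4}(\partial_\mu A^a_{0\nu} - \partial_\nu A^a_{0\mu})^2 + \bar\psi_0 (i\not{\partial} - m_0)\psi_0 - \bar{c}^a_0\, \partial^2 c^a_0 \\ & + g_0 A^a_{0\mu}\, \bar\psi_0 \gamma^\mu t^a \psi_0 - g_0 f^{abc} (\partial_\mu A^a_{0\nu}) A^{b\mu}_0 A^{c\nu}_0 \\ & - \tfrac{1}{4} g_0^2\, (f^{eab} A^a_{0\mu} A^b_{0\nu})(f^{ecd} A^{c\mu}_0 A^{d\nu}_0) - g_0\, \bar{c}^a_0 f^{abc}\, \partial^\mu A^b_{0\mu}\, c^c_0. \end{aligned} \tag{16.86} $$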

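*G. 't Hooft, unpublished; H. D. Politzer, Phys. Rev. Lett. 30, 1346 (1973); D. J.
Gross and F. Wilczek, Phys. Rev. Lett. 30, 1343 (1973).

We choose $\xi = \infty$ for simplicity. We now rescale the fields to the
renormalized field strengths by extracting the factors $Z_2$, $Z_3$, $Z_2^c$ for
the fermions, gauge bosons, and ghosts, and shift the coupling to the
renormalized coupling $g$. The Lagrangian then takes the form

$$ \mathcal{L} = \mathcal{L}_{\text{ren}} + \mathcal{L}_{\text{c.t.}}, $$

where $\mathcal{L}_{\text{ren}}$ is the Lagrangian (16.34) and
$\mathcal{L}_{\text{c.t.}}$ takes the form

$$ \begin{aligned} \mathcal{L}_{\text{c.t.}} = {} & -\tfrac{1}{4}\delta_3 (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 + \bar\psi (i\delta_2 \not{\partial} - \delta_m)\psi - \delta_2^c\, \bar{c}^a \partial^2 c^a \\ & + g\delta_1 A^a_\mu\, \bar\psi \gamma^\mu t^a \psi - g\delta_1^{3g} f^{abc} (\partial_\mu A^a_\nu) A^{b\mu} A^{c\nu} \\ & - \tfrac{1}{4} g^2 \delta_1^{4g}\, (f^{eab} A^a_\mu A^b_\nu)(f^{ecd} A^{c\mu} A^{d\nu}) - g\delta_1^c\, \bar{c}^a f^{abc}\, \partial^\mu A^b_\mu\, c^c, \end{aligned} \tag{16.87} $$

with the counterterms defined by

$$ \begin{aligned} & \delta_2 = Z_2 - 1, \qquad \delta_3 = Z_3 - 1, \qquad \delta_2^c = Z_2^c - 1, \qquad \delta_m = Z_2 m_0 - m, \\ & \delta_1 = \frac{g_0}{g} Z_2 (Z_3)^{1/2} - 1, \qquad \delta_1^{3g} = \frac{g_0}{g} (Z_3)^{3/2} - 1, \\ & \delta_1^{4g} = \frac{g_0^2}{g^2} (Z_3)^2 - 1, \qquad \delta_1^c = \frac{g_0}{g} Z_2^c (Z_3)^{1/2} - 1. \end{aligned} \tag{16.88} $$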

Notice that these eight counterterms depend on five underlying parameters;
thus, there are three relations among them. The situation is very similar to
that for the scalar theories with spontaneously broken symmetry that we
studied in Chapter 11. The underlying symmetry of the theory (here, local gauge
invariance) implies relations among the divergent amplitudes of the theory
and among the counterterms required to cancel them. In the present case, a
set of five renormalization conditions uniquely specifies all of the counterterms
in a way that removes all divergences from the theory.

This program is especially simple at one-loop order. In this case we can
expand $g_0/g$ and the various $Z$ factors about 1, keeping only the
leading-order contribution to each counterterm. Then the three relations among
the counterterms can be written

$$ \delta_1 - \delta_2 = \delta_1^{3g} - \delta_3 = \tfrac{1}{2}(\delta_1^{4g} - \delta_3) = \delta_1^c - \delta_2^c. \tag{16.89} $$
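
The leading-order statement can be illustrated numerically: if $g_0/g$ and the $Z$ factors of (16.88) are displaced from 1 by amounts of order $\epsilon$, the four expressions equated in (16.89) should agree up to terms of order $\epsilon^2$. The sketch below is a hedged check of that scaling only. The displacement coefficients `a`, `b`, `c`, `e` are arbitrary illustrative numbers, the function names are hypothetical, and $\delta_m$ is omitted because it does not enter (16.89).

```python
import math


def counterterms(g_ratio, z2, z3, z2c):
    """Counterterms of Eq. (16.88) for given g0/g, Z_2, Z_3, and Z_2^c."""
    return {
        "d2": z2 - 1,
        "d3": z3 - 1,
        "d2c": z2c - 1,
        "d1": g_ratio * z2 * math.sqrt(z3) - 1,
        "d1_3g": g_ratio * z3 ** 1.5 - 1,
        "d1_4g": g_ratio ** 2 * z3 ** 2 - 1,
        "d1c": g_ratio * z2c * math.sqrt(z3) - 1,
    }


def combinations(eps, a=0.3, b=-0.7, c=0.5, e=0.2):
    """The four expressions equated in (16.89), with each parameter
    displaced from 1 by an arbitrary multiple of eps."""
    ct = counterterms(1 + a * eps, 1 + b * eps, 1 + c * eps, 1 + e * eps)
    return [
        ct["d1"] - ct["d2"],
        ct["d1_3g"] - ct["d3"],
        0.5 * (ct["d1_4g"] - ct["d3"]),
        ct["d1c"] - ct["d2c"],
    ]


def spread(eps):
    """Largest disagreement among the four combinations."""
    values = combinations(eps)
    return max(values) - min(values)


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        # The common value is first order in eps; the spread is second order.
        print(eps, combinations(eps)[0], spread(eps))
```

Shrinking $\epsilon$ by a factor of ten should shrink the spread by roughly a factor of one hundred, which is the signature of agreement at first order.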

It is instructive to check explicitly that the values of $\delta_1^{3g}$,
$\delta_1^{4g}$, and $\delta_1^c$ determined from (16.89) indeed remove the
divergences of the corresponding vertex diagrams; this is the subject of
Problem 16.3. Using relations (16.89), it is easy to show that the one-loop
calculation of the $\beta$ function will yield the same value, whichever gauge
boson vertex is used in the computation. More generally, consider a non-Abelian
gauge theory with many different species of particles, bosons and fermions,
which couple to the gauge field. Then, to one-loop order, the quantity

$$ \delta_1^i - \delta_2^i, $$

where $\delta_1^i$ is the vertex counterterm for species $i$ and $\delta_2^i$ is
the corresponding field strength counterterm, takes a universal value. This
value is gauge dependent, so that the gauge dependence of its divergent part
cancels the gauge dependence of $\delta_3$ in the computation of the $\beta$
function.

In our discussion of the counterterms of QED at the end of Section 10.3,
we remarked that the relation between $\delta_1$ and $\delta_2$ ensured that all
electrically charged species see a common universal value of the coupling
constant $e$. In non-Abelian gauge theories, the relations (16.89) and their
higher-loop generalizations preserve the universality of the non-Abelian
couplings. In QED, we were able to obtain an even stronger relation,
$\delta_1 = \delta_2$ or $Z_1 = Z_2$, from the absolute normalization of the
matrix elements of the vector current. However, in non-Abelian gauge theories,
the corresponding vector current $j^{\mu a} = \bar\psi \gamma^\mu t^a \psi$
transforms under local gauge transformations in the adjoint representation.
Thus the Faddeev-Popov prescription cannot be used to compute matrix elements
of this current unambiguously, and thus the normalization of these
matrix elements is not preserved by the perturbation theory.


16.6 Asymptotic Freedom: The Background Field Method

In the previous section, we saw that the $\beta$ function of a non-Abelian gauge
theory with a sufficiently small number of fermions is negative. This result
is important enough that it is worthwhile to derive it twice. The preceding
derivation was straightforward but not very illuminating. In this section we
give a second derivation of the same result, which is more abstract but much
cleaner and more transparent.

The method of this section reflects the spirit of Wilson's idea of integrating
out the high-momentum degrees of freedom, while taking proper care
to preserve gauge invariance. We will compute the effective action of a
non-Abelian gauge theory for a fixed, slowly varying, classical background gauge
field $A^a_\mu(x)$. By adopting a canonical normalization of this field, we can
interpret the coefficient of the effective action as a running coupling constant.
This method is analogous to Polyakov's method for computing the $\beta$ function
of the nonlinear sigma model, presented in Section 13.3.


Background Field Perturbation Theory 

To set up the computation, rescale the gauge field $gA^a_\mu \to A^a_\mu$. In
this normalization, the gauge coupling is removed from the covariant derivative
and moved to the coefficient of the gauge field kinetic energy term. We thus
start from the Lagrangian

$$ \mathcal{L} = -\frac{1}{4g^2} (F^a_{\mu\nu})^2 + \bar\psi (i\not{D}) \psi, \tag{16.90} $$

with

$$ \begin{aligned} D_\mu &= \partial_\mu - i A^a_\mu t^a, \\ F^a_{\mu\nu} &= \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + f^{abc} A^b_\mu A^c_\nu, \end{aligned} \tag{16.91} $$

and the fermion mass set to zero for simplicity. The transformation laws of
$A^a_\mu$ and $\psi$ are also independent of the coupling constant:

$$ \delta A^a_\mu = \partial_\mu \alpha^a + f^{abc} A^b_\mu \alpha^c, \qquad \delta\psi = i\alpha^a t^a \psi. \tag{16.92} $$

On the other hand, the coupling constant $g$ will appear in the gauge field
propagator.

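Next, split the gauge field into a classical background field and a
fluctuating quantum field:

$$ A^a_\mu \to A^a_\mu + \mathcal{A}^a_\mu. \tag{16.93} $$

We will treat the classical part $A^a_\mu$ as a fixed field configuration and the
fluctuating part $\mathcal{A}^a_\mu$ as the integration variable of the
functional integral. From here on, we will use the symbol $D_\mu$ to denote the
covariant derivative with respect to the background field:
$D_\mu = \partial_\mu - i A^a_\mu t^a$. Then

$$ \bar\psi (i\not{D}) \psi \to \bar\psi (i\not{D}) \psi + \bar\psi\, \gamma^\mu \mathcal{A}^a_\mu t^a\, \psi. \tag{16.94} $$

The Yang-Mills field strength decomposes as follows:

$$ \begin{aligned} F^a_{\mu\nu} &\to \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + f^{abc} A^b_\mu A^c_\nu \\ &\quad + \partial_\mu \mathcal{A}^a_\nu - \partial_\nu \mathcal{A}^a_\mu + f^{abc} (A^b_\mu \mathcal{A}^c_\nu - A^b_\nu \mathcal{A}^c_\mu) + f^{abc} \mathcal{A}^b_\mu \mathcal{A}^c_\nu \\ &= F^a_{\mu\nu} + D_\mu \mathcal{A}^a_\nu - D_\nu \mathcal{A}^a_\mu + f^{abc} \mathcal{A}^b_\mu \mathcal{A}^c_\nu, \end{aligned} \tag{16.95} $$

where, in the last line, $F^a_{\mu\nu}$ is the field strength of the classical
field, and $D_\mu$ is the covariant derivative in the adjoint representation,
Eq. (15.86). Notice that, both in (16.94) and in (16.95), the derivative
$\partial_\mu$ appears only as a part of the covariant derivative with respect
to the background field.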

If the background field A® is regarded as fixed, the Lagrangian has a local 
gauge symmetry implemented by transformations on M“: 

A“ -4 M“ + D + f ahc A b ,<A, (16.96) 


To define the functional integral, we must gauge-fix using the Faddeev-Popov
procedure. We choose a gauge-fixing condition that is covariant with respect
to the background gauge field:

$$G^a(\mathcal{A}) = D^\mu\mathcal{A}^a_\mu - \omega^a, \tag{16.97}$$

instead of (16.28). The Faddeev-Popov determinant involves the variation of
this operator with respect to the gauge transformation (16.96). As in Section
16.2, we can promote the gauge-fixing term to the exponent, to quantize the
theory in the background field analogue of Feynman-’t Hooft gauge. Then the
gauge-fixed Lagrangian is

$$\begin{aligned}
\mathcal{L}_{FP} = &-\frac{1}{4g^2}\bigl(F^a_{\mu\nu} + D_\mu\mathcal{A}^a_\nu - D_\nu\mathcal{A}^a_\mu
 + f^{abc}\mathcal{A}^b_\mu\mathcal{A}^c_\nu\bigr)^2
 - \frac{1}{2g^2}(D^\mu\mathcal{A}^a_\mu)^2 \\
&+ \bar\psi\bigl(i\slashed{D} + \slashed{\mathcal{A}}^a t^a\bigr)\psi
 + \bar c^a\bigl(-(D^2)^{ac} - D^\mu f^{abc}\mathcal{A}^b_\mu\bigr)c^c.
\end{aligned} \tag{16.98}$$


The Lagrangian (16.98) is gauge-fixed, but it is invariant under a local
symmetry that transforms both $\mathcal{A}^a_\mu$ and the background field $A^a_\mu$:

$$\begin{aligned}
A^a_\mu &\to A^a_\mu + D_\mu\beta^a, \\
\mathcal{A}^a_\mu &\to \mathcal{A}^a_\mu - f^{abc}\beta^b\mathcal{A}^c_\mu, \\
\psi &\to \psi + i\beta^a t^a\psi.
\end{aligned} \tag{16.99}$$

Under this transformation, $\mathcal{A}^a_\mu$ transforms as a matter field in the
adjoint representation, while $A^a_\mu$ carries the part of the local gauge
transformation proportional to $\partial_\mu\beta^a$. To prove that (16.99) is a
symmetry of (16.98), we need only note that (16.98) is globally invariant, and
that $A^a_\mu$ appears in (16.98) only as a part of the covariant derivative and the
field strength. The transformation (16.99) is also a symmetry of the functional
measure. Thus, if we functionally integrate over $\mathcal{A}^a_\mu$, $\psi$, and $c^a$ to
compute the effective action, the result must be invariant to local gauge
transformations of $A^a_\mu$. This observation greatly simplifies the analysis of
the effective action.


One-Loop Correction to the Effective Action 

Let us now compute the effective action, using the method of Section 11.4. To
compute $\Gamma[A^a_\mu]$ to one-loop order, we drop terms linear in the fluctuating
field $\mathcal{A}^a_\mu$ and then integrate over the terms quadratic in
$\mathcal{A}^a_\mu$ and the fermion and ghost fields. This produces functional
determinants, which we can evaluate into an appropriate form for an effective
action.

To carry out this program, we must work out the terms in (16.98)
quadratic in each of the various fields. The terms quadratic in
$\mathcal{A}^a_\mu$ are:

$$\mathcal{L}_{\mathcal{A}} = -\frac{1}{2g^2}\Bigl\{\tfrac12(D_\mu\mathcal{A}^a_\nu - D_\nu\mathcal{A}^a_\mu)^2
 + F^{a\mu\nu}f^{abc}\mathcal{A}^b_\mu\mathcal{A}^c_\nu
 + (D^\mu\mathcal{A}^a_\mu)^2\Bigr\}. \tag{16.100}$$

After integrating by parts, we can rewrite this as

$$\mathcal{L}_{\mathcal{A}} = -\frac{1}{2g^2}\Bigl\{\mathcal{A}^a_\mu\bigl[-(D^2)^{ab}g^{\mu\nu}
 + (D^\nu D^\mu)^{ab} - (D^\mu D^\nu)^{ab}\bigr]\mathcal{A}^b_\nu
 - \mathcal{A}^a_\mu f^{abc}F^{b\mu\nu}\mathcal{A}^c_\nu\Bigr\}. \tag{16.101}$$

The term in brackets contains the commutator of covariant derivatives. This
can be simplified using (15.48); the result combines with the last term to give

$$\mathcal{L}_{\mathcal{A}} = -\frac{1}{2g^2}\Bigl\{\mathcal{A}^a_\mu\bigl[-(D^2)^{ac}g^{\mu\nu}
 - 2f^{abc}F^{b\mu\nu}\bigr]\mathcal{A}^c_\nu\Bigr\}. \tag{16.102}$$

The first term is part of a covariant d’Alembertian operator. The second term
seems quite special, but we can put it into a form that will be convenient later
as follows: First, we recognize that $F^b_{\mu\nu}$ is contracted with a group
generator in the adjoint representation. Next, we introduce the matrix (3.18)
that is the generator of Lorentz transformations on 4-vectors:

$$(\mathcal{J}^{\rho\sigma})_{\alpha\beta} = i\bigl(\delta^\rho_\alpha\delta^\sigma_\beta
 - \delta^\sigma_\alpha\delta^\rho_\beta\bigr). \tag{16.103}$$

With these replacements, we can write (16.102) in the form

$$\mathcal{L}_{\mathcal{A}} = -\frac{1}{2g^2}\Bigl\{\mathcal{A}^a_\mu\Bigl[-(D^2)^{ac}g^{\mu\nu}
 + 2\bigl(\tfrac12 F^b_{\rho\sigma}\mathcal{J}^{\rho\sigma}\bigr)^{\mu\nu}(t^b_G)^{ac}\Bigr]
 \mathcal{A}^c_\nu\Bigr\}. \tag{16.104}$$

The object in brackets can be considered as a generalized d’Alembertian for
fluctuations on the background field.

Next, we reduce the quadratic terms in fermion fields in a similar way.
The quadratic Lagrangian for the fermion field is

$$\mathcal{L}_\psi = \bar\psi(i\slashed{D})\psi. \tag{16.105}$$

Integrating over the fermion fields, we find the determinant of the operator
$(i\slashed{D})$. This is conveniently expressed as the square root of the
determinant of the operator

$$\begin{aligned}
(i\slashed{D})^2 &= -\gamma^\mu\gamma^\nu D_\mu D_\nu \\
&= \bigl(-\tfrac12\{\gamma^\mu,\gamma^\nu\} - \tfrac12[\gamma^\mu,\gamma^\nu]\bigr)D_\mu D_\nu \\
&= -D^2 + 2i\bigl(\tfrac{i}{4}[\gamma^\mu,\gamma^\nu]\bigr)D_\mu D_\nu.
\end{aligned} \tag{16.106}$$

In the last line, the commutator of Dirac matrices forms the generator of
Lorentz transformations in the spinor representation, $S^{\mu\nu}$ (3.23). Since
this object is antisymmetric in its indices, the product $D_\mu D_\nu$ that is
contracted with it can be replaced by half of their commutator. Then (16.106)
takes the form

$$(i\slashed{D})^2 = -D^2 + 2\bigl(\tfrac12 F^b_{\rho\sigma}S^{\rho\sigma}\bigr)t^b, \tag{16.107}$$

where $t^a$ is now given in the representation of the fermions. This is just the
d’Alembertian in (16.104), rewritten for the new set of spin and gauge quantum
numbers. If the theory contains $n_f$ species of fermions, the fermionic
functional integral gives the determinant of (16.107) raised to the power $n_f/2$.

The quadratic term in ghosts is simply

$$\mathcal{L}_c = \bar c^a\bigl[-(D^2)^{ab}\bigr]c^b. \tag{16.108}$$

This contains the same d’Alembertian operator written for the case of spin
zero.

To summarize all of these results, we define the general covariant
background-field d’Alembertian as

$$\Delta_{r,j} = -D^2 + 2\bigl(\tfrac12 F^b_{\rho\sigma}\mathcal{J}^{\rho\sigma}\bigr)t^b, \tag{16.109}$$

acting on a field of representation $r$ and spin $j$. The square of the covariant
derivative gives the normal, convective, minimal coupling of the particle
described by $\Delta_{r,j}$ to the gauge field. The additional term is a magnetic
moment interaction with the gauge field, whose strength corresponds to a
$g$-factor $g = 2$. Using this general expression, we can write the effective
action for the classical fields $A^a_\mu$, to one-loop order, as

$$\begin{aligned}
e^{i\Gamma[A]} &= \int\mathcal{D}\mathcal{A}\,\mathcal{D}\psi\,\mathcal{D}c\;
 \exp\Bigl[i\int d^4x\,(\mathcal{L}_{FP} + \mathcal{L}_{c.t.})\Bigr] \\
&= \exp\Bigl[i\int d^4x\,\bigl(-\tfrac{1}{4g^2}(F^a_{\mu\nu})^2 + \mathcal{L}_{c.t.}\bigr)\Bigr]
 \cdot(\det\Delta_{G,1})^{-1/2}\,(\det\Delta_{r,1/2})^{+n_f/2}\,(\det\Delta_{G,0})^{+1},
\end{aligned} \tag{16.110}$$

where $\mathcal{L}_{c.t.}$ is the counterterm Lagrangian and the three determinants
are the results of evaluating the gauge field, fermion, and ghost functional
integrals. Additional loop corrections to the effective action are suppressed by
another factor of $g^2$.

Since each integral contributing to (16.110) is invariant to (16.99), each
determinant will be a gauge-invariant functional of $A^a_\mu$. If we expand the
determinants in powers of the background field, we should then find a series
of terms that begins

$$\log\det\Delta_{r,j} = \frac{i}{2}\int d^4x\,\Bigl(\tfrac12\mathbf{C}_{r,j}(F^a_{\mu\nu})^2 + \cdots\Bigr), \tag{16.111}$$

where the succeeding terms contain higher-dimension gauge-invariant operators.
The coefficient $\mathbf{C}_{r,j}$ can depend on the representation $r$ and the spin
$j$. This first term of the expansion modifies the zeroth-order effective action
according to

$$\frac{1}{4g^2}(F^a_{\mu\nu})^2 \to \frac14\Bigl(\frac{1}{g^2} + \frac12\mathbf{C}_{G,1}
 - \mathbf{C}_{G,0} - \frac{n_f}{2}\mathbf{C}_{r,1/2}\Bigr)(F^a_{\mu\nu})^2. \tag{16.112}$$

The factors $\mathbf{C}_{r,j}$ are dimensionless but, since they arise from a
one-loop computation, we should expect that they are logarithmically divergent:

$$\mathbf{C}_{r,j} = c_{r,j}\log\frac{\Lambda^2}{k^2} + \cdots, \tag{16.113}$$

where $k$ is a momentum characterizing the variation of the background field.
The counterterm $\delta_3$ removes the divergence; if we impose a renormalization
condition at the scale $M$, then the addition of (16.113) and its counterterm
gives the result (16.112) with the replacement

$$\mathbf{C}_{r,j} \to c_{r,j}\log\frac{M^2}{k^2}. \tag{16.114}$$

Then the original fixed coupling constant in the effective action is replaced by
a running coupling constant

$$\frac{1}{g^2(k^2)} = \frac{1}{g^2} + \Bigl(\frac12 c_{G,1} - c_{G,0}
 - \frac{n_f}{2}c_{r,1/2}\Bigr)\log\frac{M^2}{k^2}, \tag{16.115}$$

or

$$g^2(k^2) = \frac{g^2}{1 - \bigl(\frac12 c_{G,1} - c_{G,0} - \frac{n_f}{2}c_{r,1/2}\bigr)
 g^2\log(k^2/M^2)}. \tag{16.116}$$

By comparing this form to Eq. (12.88), we see that this running coupling
constant is the solution to the renormalization group equation for the $\beta$
function

$$\beta(g) = \Bigl(\frac12 c_{G,1} - c_{G,0} - \frac{n_f}{2}c_{r,1/2}\Bigr)g^3. \tag{16.117}$$

Thus, by calculating the $c_{r,j}$, we can directly obtain the leading coefficient
of the $\beta$ function.

Computation of the Functional Determinants 

To compute $c_{r,j}$, we must work out the first term in the expansion of the
determinant in powers of the external field. To expand the determinant, we
proceed as in the example in Section 9.5. Write

$$\Delta_{r,j} = -\partial^2 + \Delta^{(1)} + \Delta^{(2)} + \Delta^{(\mathcal{J})}, \tag{16.118}$$

where

$$\begin{aligned}
\Delta^{(1)} &= i\bigl[\partial^\mu A^a_\mu t^a + A^a_\mu t^a\partial^\mu\bigr], \\
\Delta^{(2)} &= A^{a\mu}t^a A^b_\mu t^b, \\
\Delta^{(\mathcal{J})} &= 2\bigl(\tfrac12 F^b_{\rho\sigma}\mathcal{J}^{\rho\sigma}\bigr)t^b.
\end{aligned} \tag{16.119}$$

The pieces $\Delta^{(1)}$ and $\Delta^{(\mathcal{J})}$ contain one power of the external
field; $\Delta^{(2)}$ contains two powers of $A^a_\mu$. Treating these terms as
perturbations, we write

$$\begin{aligned}
\log\det\Delta_{r,j} &= \log\det\bigl[-\partial^2 + (\Delta^{(1)} + \Delta^{(2)} + \Delta^{(\mathcal{J})})\bigr] \\
&= \log\det[-\partial^2] + \log\det\bigl[1 + (-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)} + \Delta^{(\mathcal{J})})\bigr] \\
&= \log\det[-\partial^2] + \mathrm{tr}\log\bigl[1 + (-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)} + \Delta^{(\mathcal{J})})\bigr] \\
&= \log\det[-\partial^2] + \mathrm{tr}\bigl[(-\partial^2)^{-1}(\Delta^{(1)} + \Delta^{(2)} + \Delta^{(\mathcal{J})}) + \cdots\bigr].
\end{aligned} \tag{16.120}$$

The first term on the right in (16.120) is an irrelevant constant. The terms
in this expansion that are linear in $A^a_\mu$ vanish by gauge invariance (or, more
explicitly, because $\mathrm{tr}[t^a] = 0$). The quadratic terms in $A^a_\mu$ must
organize themselves into the structure of (16.111), plus terms with higher
derivatives.

Figure 16.10. Terms quadratic in the external field in the expansion of
$\log\det\Delta_{r,j}$. The special vertex arises from the
$F_{\rho\sigma}\mathcal{J}^{\rho\sigma}$ coupling.

The terms in (16.111) quadratic in $A^a_\mu$ can be written in Fourier space as

$$\log\det\Delta_{r,j} = \frac{i}{2}\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^a_\nu(k)\,
 (k^2g^{\mu\nu} - k^\mu k^\nu)\cdot\bigl[\mathbf{C}_{r,j} + \mathcal{O}(k^2)\bigr]. \tag{16.121}$$

We will now compute these terms explicitly from (16.120) and bring them
into the form of (16.121). The terms with two powers of $A^a_\mu$ in the expansion
(16.120) are those with one power of $\Delta^{(2)}$ or two powers of $\Delta^{(1)}$ or
$\Delta^{(\mathcal{J})}$. Further, terms linear in $\Delta^{(\mathcal{J})}$ are proportional
to $\mathrm{tr}[\mathcal{J}^{\rho\sigma}] = 0$, so the cross term between $\Delta^{(1)}$ and
$\Delta^{(\mathcal{J})}$ vanishes. The three remaining contributions correspond to the
Feynman diagrams shown in Fig. 16.10.

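The term involving two powers of $\Delta^{(1)}$ is

$$-\frac12\,\mathrm{tr}\bigl[(-\partial^2)^{-1}\Delta^{(1)}(-\partial^2)^{-1}\Delta^{(1)}\bigr]
 = -\frac12\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^b_\nu(k)\int\frac{d^4p}{(2\pi)^4}\,
 \mathrm{tr}\Bigl[\frac{1}{p^2}(2p+k)^\mu t^a\frac{1}{(p+k)^2}(2p+k)^\nu t^b\Bigr], \tag{16.122}$$

where the trace is now simply a trace over gauge and spin indices. The factor
1/2 comes from the expansion of the logarithm. The term involving one power
of $\Delta^{(2)}$ is

$$\mathrm{tr}\bigl[(-\partial^2)^{-1}\Delta^{(2)}\bigr]
 = \int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^b_\nu(k)\int\frac{d^4p}{(2\pi)^4}\,
 \mathrm{tr}\Bigl[\frac{1}{p^2}\,g^{\mu\nu}t^at^b\Bigr]. \tag{16.123}$$

As Fig. 16.10 suggests, these two contributions are precisely proportional to
the contribution of a scalar particle to the QED vacuum polarization, times
the factor

$$\mathrm{tr}[t^at^b] = C(r)\,d(j)\,\delta^{ab}, \tag{16.124}$$

where $d(j)$ is the number of spin components. The values of the diagrams can
be worked out using the methods of the previous section (or simply recalled
from Problem 9.1). One finds that the two diagrams together sum up to the
gauge-invariant form (16.121), to give

$$\frac{i}{2}\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^a_\nu(k)\,(k^2g^{\mu\nu} - k^\mu k^\nu)\cdot
 \Bigl[\frac{C(r)\,d(j)}{3(4\pi)^2}\,\Gamma(2-\tfrac{d}{2}) + \cdots\Bigr]. \tag{16.125}$$

The term involving two powers of $\Delta^{(\mathcal{J})}$ is

$$\begin{aligned}
-\frac12\,\mathrm{tr}\bigl[(-\partial^2)^{-1}\Delta^{(\mathcal{J})}(-\partial^2)^{-1}\Delta^{(\mathcal{J})}\bigr]
 = -\frac12&\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^b_\nu(k) \\
 &\times\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\Bigl[\frac{1}{p^2}
 \bigl(2ik^\rho g^{\mu\sigma}\mathcal{J}_{\rho\sigma}\bigr)t^a\,
 \frac{1}{(p+k)^2}\bigl(-2ik^\alpha g^{\nu\beta}\mathcal{J}_{\alpha\beta}\bigr)t^b\Bigr].
\end{aligned} \tag{16.126}$$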

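To evaluate this, define $C(j)$ as the trace over spin indices

$$\mathrm{tr}\bigl[\mathcal{J}^{\rho\sigma}\mathcal{J}^{\alpha\beta}\bigr]
 = \bigl(g^{\rho\alpha}g^{\sigma\beta} - g^{\rho\beta}g^{\sigma\alpha}\bigr)C(j). \tag{16.127}$$

It is straightforward to work out from the explicit expressions that

$$C(j) = \begin{cases}
0 & \text{scalars;} \\
1 & \text{Dirac spinors;} \\
2 & \text{4-vectors.}
\end{cases} \tag{16.128}$$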
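The 4-vector entry of (16.128) can be checked by brute force. The following is a minimal sketch, assuming the metric signature $(+,-,-,-)$ and the mixed-index form $(\mathcal{J}^{\rho\sigma})^\mu{}_\nu = i(g^{\rho\mu}\delta^\sigma_\nu - g^{\sigma\mu}\delta^\rho_\nu)$ obtained from (16.103) by raising the first matrix index; the function names are illustrative only. It loops over all index choices and confirms that the trace in (16.127) is proportional to the stated tensor structure with a single constant.

```python
from itertools import product

# Metric g^{mu nu} = diag(+1, -1, -1, -1) (assumed signature).
G = [[0] * 4 for _ in range(4)]
for index, sign in enumerate((1, -1, -1, -1)):
    G[index][index] = sign


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def lorentz_generator(rho, sigma):
    """4x4 matrix (J^{rho sigma})^mu_nu for the vector representation."""
    return [
        [1j * (G[rho][mu] * delta(sigma, nu) - G[sigma][mu] * delta(rho, nu))
         for nu in range(4)]
        for mu in range(4)
    ]


def trace_of_product(a, b):
    """tr[a b] for two 4x4 matrices stored as nested lists."""
    return sum(a[m][n] * b[n][m] for m in range(4) for n in range(4))


def vector_casimir():
    """Return C(j) for 4-vectors, checking (16.127) for every index choice."""
    c = None
    for r, s, a, b in product(range(4), repeat=4):
        lhs = trace_of_product(lorentz_generator(r, s), lorentz_generator(a, b))
        structure = G[r][a] * G[s][b] - G[r][b] * G[s][a]
        if structure == 0:
            # The tensor structure vanishes, so the trace must vanish too.
            assert abs(lhs) < 1e-12
            continue
        ratio = lhs / structure
        if c is None:
            c = ratio
        # The same constant must appear for every nonvanishing component.
        assert abs(ratio - c) < 1e-12
    return c.real
```

Calling `vector_casimir()` reproduces the 4-vector value $C(j) = 2$. The scalar case is trivial ($\mathcal{J} = 0$), and the Dirac case would need explicit gamma matrices, which this sketch does not include.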


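Then (16.126) can be evaluated as

$$\begin{aligned}
-\frac12&\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^a_\nu(k)\int\frac{d^4p}{(2\pi)^4}\,
 \frac{1}{p^2(p+k)^2}\,(k^2g^{\mu\nu} - k^\mu k^\nu)\,4C(r)C(j) \\
&= \frac{i}{2}\int\frac{d^4k}{(2\pi)^4}\,A^a_\mu(-k)A^a_\nu(k)\,(k^2g^{\mu\nu} - k^\mu k^\nu)
 \Bigl(-\frac{4C(r)C(j)}{(4\pi)^2}\,\Gamma(2-\tfrac{d}{2}) + \cdots\Bigr).
\end{aligned} \tag{16.129}$$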


Adding (16.125) and (16.129), we find that the coefficient $\mathbf{C}_{r,j}$ in
(16.111) is given by

$$\mathbf{C}_{r,j} = \frac{1}{(4\pi)^2}\Bigl[\frac13 d(j) - 4C(j)\Bigr]C(r)\,\Gamma(2-\tfrac{d}{2}). \tag{16.130}$$

Thus,

$$c_{r,j} = \frac{1}{(4\pi)^2}\Bigl[\frac13 d(j) - 4C(j)\Bigr]C(r), \tag{16.131}$$

or explicitly,

$$c_{r,j} = \frac{C(r)}{(4\pi)^2}\cdot\begin{cases}
+1/3 & \text{scalars;} \\
-8/3 & \text{Dirac spinors;} \\
-20/3 & \text{4-vectors.}
\end{cases} \tag{16.132}$$

Notice that, whenever the magnetic moment term is nonzero, it dominates,
and that its coefficient is opposite in sign from the convective term.

Inserting the values from (16.132) into (16.117), we find

$$\beta(g) = -\frac{g^3}{(4\pi)^2}\Bigl(\frac{11}{3}C_2(G) - \frac43 n_f C(r)\Bigr). \tag{16.133}$$
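The arithmetic that leads from (16.131) and (16.117) to (16.133) can be replayed with exact fractions. This is a minimal sketch: the function names are hypothetical, the common factor $1/(4\pi)^2$ is stripped off, and $C(r)$ for the adjoint representation is taken to be $C_2(G)$, as in (16.133).

```python
from fractions import Fraction

# (d(j), C(j)) for each spin, from (16.124) and (16.128).
SPIN_DATA = {
    "scalar": (1, 0),
    "dirac": (4, 1),
    "vector": (4, 2),
}


def spin_factor(kind):
    """Return d(j)/3 - 4 C(j), the bracket in (16.131)."""
    d, c = SPIN_DATA[kind]
    return Fraction(d, 3) - 4 * c


def beta_coefficient(c2_adjoint, c_rep, n_f):
    """Return b with beta(g) = b g^3 / (4 pi)^2, assembled as in (16.117)."""
    gauge = Fraction(1, 2) * spin_factor("vector") * c2_adjoint
    ghost = spin_factor("scalar") * c2_adjoint
    fermion = Fraction(n_f, 2) * spin_factor("dirac") * c_rep
    return gauge - ghost - fermion


def beta_coefficient_closed_form(c2_adjoint, c_rep, n_f):
    """Return the same coefficient read off directly from (16.133)."""
    return -(Fraction(11, 3) * c2_adjoint - Fraction(4, 3) * n_f * c_rep)
```

For example, with $C_2(G) = 3$, $C(r) = 1/2$ and $n_f = 6$, `beta_coefficient(3, Fraction(1, 2), 6)` gives $-7$, in agreement with the closed form.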


We thus confirm the conclusion of the previous section, that non-Abelian
gauge theories with sufficiently few fermions are asymptotically free.


16.7 Asymptotic Freedom: A Qualitative Explanation

In the previous two sections† we twice calculated the $\beta$ function in
non-Abelian gauge theory:

$$\beta(g) = -\frac{g^3}{(4\pi)^2}\Bigl(\frac{11}{3}C_2(G) - \frac43 n_f C(r)\Bigr). \tag{16.134}$$

Here $n_f$ is the number of fermion species in representation $r$, $C(r)$ is the
constant appearing in the orthogonality relation (15.78) for the representation
matrices, and $C_2(G)$ is the quadratic Casimir operator of the adjoint
representation of the group, defined in Eq. (15.92). In an $SU(N)$ gauge theory
with fermions in the fundamental representation, this result becomes

$$\beta(g) = -\frac{g^3}{(4\pi)^2}\Bigl(\frac{11}{3}N - \frac23 n_f\Bigr). \tag{16.135}$$

The overall minus sign implies that, for sufficiently small $n_f$, non-Abelian
gauge theories are asymptotically free. In this case the running coupling
constant tends to zero at large momenta, according to Eq. (12.92):

$$g^2(k) = \frac{g^2}{1 + \frac{g^2}{(4\pi)^2}\bigl(\frac{11}{3}N - \frac23 n_f\bigr)\log(k^2/M^2)}. \tag{16.136}$$
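The behavior described by (16.135) and (16.136) is easy to explore numerically. The sketch below assumes fermions in the fundamental representation of $SU(N)$ and takes $g^2$ at the reference scale $M$ as input; the function names and the default values ($N = 3$, $n_f = 6$) are illustrative choices, not part of the text.

```python
import math


def one_loop_b0(n_colors, n_f):
    """Return (11/3) N - (2/3) n_f, the bracket in (16.135)."""
    return 11.0 * n_colors / 3.0 - 2.0 * n_f / 3.0


def running_g2(g2_at_m, k_over_m, n_colors=3, n_f=6):
    """One-loop g^2(k) from (16.136), given g^2 at the scale M."""
    b0 = one_loop_b0(n_colors, n_f)
    denominator = 1.0 + g2_at_m / (4.0 * math.pi) ** 2 * b0 * math.log(k_over_m ** 2)
    if denominator <= 0.0:
        # The one-loop expression is not meaningful beyond this point.
        raise ValueError("one-loop running coupling diverges at or before this scale")
    return g2_at_m / denominator


def max_flavors_for_asymptotic_freedom(n_colors):
    """Largest integer n_f for which (11/3) N - (2/3) n_f is still positive."""
    # The condition is n_f < 11 N / 2 (strict inequality).
    return (11 * n_colors - 1) // 2
```

With these definitions, `running_g2(1.0, 10.0)` is smaller than the input value and `running_g2(1.0, 0.5)` is larger, which is the one-loop statement of asymptotic freedom. For $N = 3$ the sign of the $\beta$ function stays negative up to `max_flavors_for_asymptotic_freedom(3)`, that is, 16 flavors.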


The asymptotic freedom of non-Abelian gauge theories is a surprising 
conclusion. When we first encountered the running of the electromagnetic 
coupling in Section 7.5, we found it easy to understand the direction of the 
coupling constant flow: The vacuum acquires a dielectric property due to 
virtual electron-positron pair creation, causing the effective electric charge 
to decrease at large distances. In non-Abelian gauge theories, according to 
Eq. (16.134), the fermions still produce such an effect. Furthermore, since the 
non-Abelian gauge bosons are charged, they should produce an additional 
screening effect. But according to Eq. (16.134), the net effect of the non- 
Abelian gauge bosons is opposite in sign. Apparently there must be other, 
competing, effects, which overcome the effect of screening and change the 
sign of the $\beta$ function.

The precise form of these effects depends on the gauge. They are simplest 
to describe in the Coulomb gauge, for which the gauge fixing condition is 


$$\partial_i A^{ai} = 0. \tag{16.137}$$


We will not work out the full quantization in this gauge; rather, we will just
describe its qualitative features. As in electrodynamics, the field quanta in
Coulomb gauge are described in a non-Lorentz-covariant manner as transversely
polarized photons. There are no timelike or longitudinal photons and no
propagating ghosts. However, there is a Coulomb potential, described by
the field $A^{a0}$, which obeys a constraint equation analogous to Gauss’s law.
Not surprisingly, in the non-Abelian case, Gauss’s law takes the
gauge-covariant form

$$D_i E^{ai} = g\rho^a, \tag{16.138}$$

where $E^{ai} = F^{a0i}$ and $\rho^a$ is the charge density of the global symmetry
current of the fermions. Recall from Eq. (15.86) that the covariant derivative
acting on a field in the adjoint representation is

$$(D_\mu\phi)^a = \partial_\mu\phi^a + g f^{abc}A^b_\mu\phi^c.$$

†Section 16.7 draws on the main result of 16.5 and 16.6, but does not depend
on these earlier sections. However, even if you have not read Section 16.5, you may
wish to skim pages 522 through 531 to get an overview of how the $\beta$ function
can be calculated.

To analyze the consequences of Eq. (16.138), we will choose an example
as simple and explicit as possible. Let the gauge group be $SU(2)$, so that
$a = 1, 2, 3$ and $f^{abc} = \epsilon^{abc}$. Let us compute the Coulomb potential of a
point charge of magnitude $+1$ with the orientation $a = 1$. We will solve for
$E^{ai}$ using an iteration process, putting the gauge-field term of the covariant
derivative on the right-hand side of the equation:

$$\partial_i E^{ai} = g\,\delta^{(3)}(\mathbf{x})\,\delta^{a1} + g\,\epsilon^{abc}A^{bi}E^{ci}. \tag{16.139}$$

The second term on the right shows that, in a non-Abelian gauge theory, 
a region containing vector potentials and electric fields that are parallel in 
physical space and perpendicular in the group space is a source of electric 
field. 

The implication of Eq. (16.139) is worked out pictorially in Fig. 16.11. 
The leading term on the right-hand side of (16.139) implies a $1/r^2$ electric
field of type $a = 1$ radiating from $\mathbf{x} = 0$. Somewhere in space, this
electric field will cross with a bit of vector potential $A^{ai}$ arising as a
fluctuation of the vacuum. For definiteness, let us assume that this fluctuation
has $a = 2$ and points in some diagonal direction, as shown in Fig. 16.11(a). Then
the second term on the right-hand side of Eq. (16.139) is negative for $a = 3$:
There is a sink of the field $E^{3i}$ at this location, as shown in Fig. 16.11(b).
These new fields are, in two locations, parallel or antiparallel to the original
$A^{ai}$ field
fluctuation. Looking again at the second term of Eq. (16.139), we see that 
there is a source of electric field with a = 1 closer to the origin, and a sink of 
electric field with a = 1 farther away. This is an induced electric dipole in the 
vacuum, shown in Fig. 16.11(c). But look at the signs: This dipole is oriented 
toward the original charge, and thus serves to amplify rather than screen it! 
The effect of the original charge thus becomes stronger at larger distances. 

The competition between this antiscreening effect and the screening due
to virtual pairs of gauge bosons must be worked out quantitatively. When this
is done,‡ one finds that the antiscreening effect is 12 times larger.

‡T. Appelquist, M. Dine, and I. Muzinich, Phys. Lett. 69B, 231 (1977).

Figure 16.11. The effect of vacuum fluctuations on the Coulomb field of
an $SU(2)$ gauge theory. In (a), a fluctuation $A^2$ occurs on top of the $1/r^2$
field $E^1$. These combined fields generate a sink of the field $E^3$, as shown
in (b). The $E^3$ field, in turn, combines with $A^2$ to create an effective $E^1$
dipole, shown in (c). The dipole points toward the original charge, enhancing
its field at large distances.

In this argument, it is a set of dynamical features peculiar to the
non-Abelian gauge theory that enables the coupling constant to be amplified
rather than screened at large distances. This suggests that asymptotic freedom
might be a special property of non-Abelian gauge theories. Although the
statement can be proved only by exhausting other cases, it does actually turn
out to be true: Among renormalizable quantum field theories in four spacetime
dimensions, only the non-Abelian gauge theories are asymptotically free.* We
have already seen in Chapter 14 that asymptotic freedom was suggested
experimentally as a property of the strong interactions. In the following chapter
we will build a model of the strong interactions out of a non-Abelian gauge
theory and explore its properties in detail.

*S. Coleman and D. J. Gross, Phys. Rev. Lett. 31, 851 (1973).


Problems

16.1 Arnowitt-Fickler gauge. Perform the Faddeev-Popov quantization of
Yang-Mills theory in the gauge $A^{3a} = 0$, and write the Feynman rules. Show that
there are no propagating ghosts, and that the gauge field is reduced to two
positive-metric degrees of freedom. (Although the gauge condition violates Lorentz
invariance, this symmetry is restored in the calculation of gauge-invariant
$S$-matrix elements.)

16.2 Scalar field with non-Abelian charge. Consider a non-Abelian gauge theory 
with gauge group G. Couple to this theory a complex scalar field in the representation r. 

(a) Show that the Feynman rules for the scalar field are a simple modification of 
the Feynman rules displayed for scalar QED in Problem 9.1(a). 

(b) Compute the contribution of this scalar field to the $\beta$ function, and show
that the full $\beta$ function for this theory is

$$\beta(g) = -\frac{g^3}{(4\pi)^2}\Bigl(\frac{11}{3}C_2(G) - \frac13 C(r)\Bigr).$$

16.3 Counterterm relations. In Section 16.5, we computed the divergent parts of
$\delta_1$, $\delta_2$, and $\delta_3$. It is a good exercise to compute the divergent parts
of the remaining counterterms in Eq. (16.88) to one-loop order in the
Feynman-’t Hooft gauge, and to explicitly verify that the counterterm relations
(16.89) are consistent with the removal of ultraviolet divergences.

(a) The ghost counterterms are particularly easy to compute. Work out $\delta_1^c$ and
$\delta_2^c$, and show that the divergent part of their difference equals the
divergent part of $\delta_1 - \delta_2$. This gives a derivation of asymptotic freedom
that is slightly easier than the one in Section 16.5.

(b) Compute the counterterm for the 3-gauge-boson vertex and verify the first equal¬ 
ity in (16.89). 

(c) Compute the counterterm for the 4-gauge-boson vertex and find, when the smoke 
clears, the second relation in (16.89). 



Chapter 17 


Quantum Chromodynamics 


The key to constructing a realistic model of the strong interactions is the 
phenomenon of asymptotic freedom. Chapter 14 described the experimental 
discovery of this phenomenon, while Chapter 16 presented the theoretical 
proof that non-Abelian gauge theories are asymptotically free. We are now 
ready to explore the consequences of these discoveries. 

We will begin by arguing that the most natural candidate for a model 
of the strong interactions is the non-Abelian gauge theory with gauge group 
$SU(3)$, coupled to fermions (quarks) in the fundamental representation. This
theory is known as Quantum Chromodynamics, or QCD. After some general
discussion of QCD in Section 17.1, we will investigate a number of specific 
QCD scattering processes in Sections 17.2 through 17.4. The most interesting 
application of QCD, however, is of a somewhat more sophisticated nature; it 
comes in the prediction of a pattern of slow violations of the Bjorken scaling 
relation discussed in Chapter 14. Section 17.5 develops the additional theo¬ 
retical tools that are needed to understand these violations. 

Although this chapter includes many references to experiments, we re¬ 
mind the reader that, for QCD as for QED or critical phenomena, this book 
is primarily a textbook of theoretical methods rather than a review and in¬ 
terpretation of experimental data. The details of experimental techniques and 
results on strong interaction physics are reviewed in a number of excellent 
texts (see the bibliography). We hope that this chapter will give the theoret¬ 
ical foundation necessary to illuminate and interpret these results. 

17.1 From Quarks to QCD 

Our current theoretical picture of the strong interactions began with the 
identification of the elementary fermions that make up the proton and other 
hadrons. As the properties of these fermions became better understood, the 
nature of their interactions became tightly constrained, in a way that led even¬ 
tually to a unique candidate theory. In order to appreciate the uniqueness of 
this theory, we begin this chapter with a simplified history of how it arose. 

In 1963, Gell-Mann and Zweig proposed a model that explained the spectrum
of strongly interacting particles in terms of elementary constituents
called quarks. Mesons were expected to be quark-antiquark bound states.
Indeed, the lightest mesons have just the correct quantum numbers to justify
this interpretation; they are spin-0 and spin-1 states of odd parity, just as we
found for fermion-antifermion bound states of zero orbital angular momentum
in Chapter 3. Baryons were interpreted as bound states of three quarks.
To explain the electric charges and other quantum numbers of hadrons,
Gell-Mann and Zweig needed to assume three species of quarks, up ($u$), down
($d$), and strange ($s$). Additional hadrons discovered since that time require
the existence of three more species: charm ($c$), bottom ($b$), and top ($t$). To
make baryons with integer charges, the quarks needed to be assigned fractional
electric charge: $+2/3$ for $u$, $c$, $t$, and $-1/3$ for $d$, $s$, $b$. Then, for example,
the proton would be a bound state of $uud$, while the neutron would be a bound
state of $udd$. The six types of quarks are conventionally referred to as flavors.

The quark model had great success in predicting new hadronic states, and
in explaining the strengths of electromagnetic and weak-interaction transitions
among different hadrons. In particular, the quark model naturally incorporates
the most important symmetry relations among strongly interacting particles.
If one assumes that the $u$ and $d$ quarks have identical masses and interactions,
the $SU(2)$ group that acts as a unitary rotation of $u$ and $d$ states,

$$\begin{pmatrix} u \\ d \end{pmatrix} \to \exp\bigl(i\alpha^i\sigma^i/2\bigr)
 \begin{pmatrix} u \\ d \end{pmatrix}, \tag{17.1}$$

should be a symmetry of the strong interactions. Indeed, both in nuclear and in
elementary particle physics, the quantum states form multiplets of this $SU(2)$
symmetry, called isotopic spin or isospin. Similarly, since the strange quark
is only a little heavier than the $u$ and $d$ quarks, it makes sense to consider the
symmetry of unitary transformations of the triplet $(u, d, s)$. Gell-Mann and
Ne’eman showed that the elementary particles naturally fill out irreducible
representations of this $SU(3)$ symmetry.

Despite the phenomenological success of the original quark model, it had 
two serious problems. First, despite considerable effort, free particles with 
fractional charge could not be found. Second, the spectrum of baryons re¬ 
quired the assumption that the wavefunction of the three quarks be totally 
symmetric under the interchange of the quark spin and flavor quantum num¬ 
bers, contradicting the expectation that quarks, which must have spin 1/2, 
should obey Fermi-Dirac statistics. The need for this symmetry is most clearly 
illustrated in the fact that one of the lightest excited states of the nucleon is 
a spin-3/2 particle with charge $+2$, the $\Delta^{++}$. This particle is readily
interpreted as a $uuu$ bound state with zero orbital angular momentum and all
three quark spins parallel. 

To reconcile the baryon spectrum with the spin-statistics theorem, Han
and Nambu, Greenberg, and Gell-Mann proposed that quarks carry an additional,
unobserved quantum number, called color. They introduced the ad hoc
assumption that baryon wavefunctions must be totally antisymmetric in color
quantum numbers. Then, if the quark wavefunctions are totally symmetric
in spin and flavor, they are totally antisymmetric overall, in agreement with
Fermi-Dirac statistics. The simplest model of color would be to assign quarks
to the fundamental representation of a new, internal $SU(3)$ global symmetry.
Suppressing for a moment the spin and flavor quantum numbers, we can
represent quarks by $q_i$, where $i = 1, 2, 3$ is the color index. Thus quarks
transform under the fundamental, or “3”, representation of the color $SU(3)$
symmetry. Antiquarks, $\bar q^i$, transform in the $\bar 3$ representation. The inner
product of a 3 and a $\bar 3$ is an invariant of $SU(3)$. One can also form an
invariant by using the totally antisymmetric combination of three 3’s,
$\epsilon^{ijk}q_iq_jq_k$. Under a unitary transformation, the tensor $\epsilon^{ijk}$
transforms according to

$$\epsilon^{ijk} \to U_{ii'}U_{jj'}U_{kk'}\,\epsilon^{i'j'k'} = (\det U)\,\epsilon^{ijk}, \tag{17.2}$$

and so is invariant under $SU(3)$ transformations, which have $\det U = 1$. Under
the postulate that all hadron wavefunctions must be invariant under $SU(3)$
symmetry transformations, these two types of combinations are the only simple
ones allowed:

$$\bar q^i q_i, \qquad \epsilon^{ijk}q_iq_jq_k, \qquad \epsilon_{ijk}\bar q^i\bar q^j\bar q^k. \tag{17.3}$$

That is, the assumption that physical hadrons are singlets under color implies
that the only possible light hadrons are the mesons, baryons, and antibaryons.

Like the original quark model, the color hypothesis was phenomenologi¬ 
cally successful but raised additional questions: Why should quarks have this 
seemingly superfluous property, and what mechanism insures that all hadron 
wavefunctions are color singlets? The answers to these questions came not 
from hadron spectroscopy, but from the deep-inelastic scattering experiments 
described in Chapter 14 and the ensuing search for a theory of parton binding 
with the property of asymptotic freedom. When it was discovered that non- 
Abelian gauge theories have this property, all that remained was to identify 
the correct gauge group and fermion representation. Since the color symmetry 
had no other obvious physical role, it was natural to identify this symmetry 
with the gauge group, with the colors being the gauge quantum numbers of 
the quarks. This reasoning resulted in a model of the strong interactions as a 
system of quarks, of the various flavors, each assigned to the fundamental
representation of the local gauge group $SU(3)$. The quanta of the $SU(3)$ gauge
field are called gluons, and the theory is known as Quantum Chromodynamics,
or QCD. 

In this book, we will mainly investigate the properties of QCD in the
high-energy regime, where the coupling constant has become small. However,
we should point out that one can also study QCD in the regime of strong
coupling, using an approximation scheme introduced by Wilson in which the
continuum gauge theory is replaced by a discrete statistical mechanical system
on a four-dimensional Euclidean lattice. Using this approximation, Wilson
showed that, for sufficiently strong coupling, QCD exhibits confinement of
color: The only finite-energy asymptotic states of the theory are those that
are singlets of color $SU(3)$. Thus the ad hoc assumption that explains the
spectrum of hadrons turns out to be a consequence of the non-Abelian gauge
theory coupling to color. If one attempts to separate a color-singlet state
into colored components (for example, to dissociate a meson into a quark
and an antiquark), a tube of gauge field forms between the two sources, as
shown in Fig. 17.1. In a non-Abelian gauge theory with sufficiently strong
coupling, this tube has fixed radius and energy density, so the energy cost of
separating color sources grows proportionally to the separation. A force law
of this type can consistently be weak at short distances and strong at long
distances, accounting for the fact that isolated quarks are not observed. We
will discuss the large-distance, strong-coupling limit of gauge theories further
in the Epilogue.

Figure 17.1. Gauge electric field configuration associated with the separation
of color sources in a strong-coupling gauge theory.

The short-distance limit of Quantum Chromodynamics can be readily 
studied using the Feynman diagram technology that we have developed in 
previous chapters. Here asymptotic freedom makes the coupling weak, and 
there is a sensible diagrammatic perturbation theory that begins from the 
model of free quarks and gluons. The following sections treat the elementary 
interactions among quarks and gluons that can be observed in high-energy 
experiments. 


17.2 $e^+e^-$ Annihilation into Hadrons

The simplest reaction involving quarks is the production of quark pairs in
$e^+e^-$ annihilation, a process that we treated already in Section 5.1. There we
analyzed this process only at the most elementary level, viewing it as a pure
QED reaction in which free quarks are created by a virtual photon. The diagram
for this lowest-order process is shown in Fig. 17.2(a). The computation
of the total cross section includes a sum over the various color states of the
quark fields, and so provides a confirmation that the number of allowed colors
is 3. Combining the color factor with the square of the quark electric charges,
we found (Eq. (5.16))

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_0\cdot 3\cdot\sum_f Q_f^2, \tag{17.4}$$

where $\sigma_0$ is the QED cross section for $e^+e^- \to \mu^+\mu^-$,

$$\sigma_0 = \frac{4\pi\alpha^2}{3s}, \tag{17.5}$$

and the sum in (17.4) is taken over quark flavors. This formula assumes that
the center of mass energy is high enough that we can ignore the quark masses.

Figure 17.2. Diagrams contributing to the process $e^+e^- \to$ hadrons in
QCD: (a) the leading-order diagram; (b) corrections of order $\alpha_s$.
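As a quick illustration of (17.4), the ratio $\sigma(e^+e^- \to \text{hadrons})/\sigma_0 = 3\sum_f Q_f^2$ can be tabulated for different sets of quark flavors. This is a minimal sketch using the charges quoted in Section 17.1; the function name `r_ratio` and the choice of which flavors count as light at a given energy are assumptions made for illustration.

```python
from fractions import Fraction

# Electric charges of the six quark flavors, in units of the proton charge.
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
    "t": Fraction(2, 3),
}


def r_ratio(active_flavors, n_colors=3):
    """Return sigma(e+e- -> hadrons) / sigma_0 at leading order, from (17.4).

    active_flavors lists the flavors whose masses are negligible at the
    center of mass energy considered, for example "uds".
    """
    return n_colors * sum(QUARK_CHARGES[flavor] ** 2 for flavor in active_flavors)
```

With only $u$, $d$, $s$ active, `r_ratio("uds")` gives 2; adding charm gives 10/3, and adding bottom gives 11/3. Setting `n_colors=1` shows how the same sums would come out three times smaller without the color factor.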

When we couple the quarks to an $SU(3)$ gauge theory, we add many
important processes that affect both the value of this cross section and the
final states that it includes. Some of the most important effects cannot be
discussed within the context of perturbation theory. In particular, though
the leading diagram contains free quarks, the particles that emerge from the
reaction are color-singlet mesons and baryons. However, we will find that QCD
perturbation theory with quarks and gluons does make a number of important
predictions for $e^+e^-$ annihilation to hadrons. The ideas that we develop in
working out these predictions will also apply to many other strong-interaction
processes.


Total Cross Section 

The leading corrections to the rate of $e^+e^-$ annihilation due to gluon
exchange and emission are shown in Fig. 17.2(b). These are precisely the
diagrams computed in the Final Project of Part I. The first two diagrams give
a cross section of order $g^2$, where $g$ is the $SU(3)$ gauge coupling, to
produce a gluon in addition to the quark and antiquark. The third diagram must
be summed in the amplitude with the leading diagram to produce a correction to
the rate of $q\bar q$ production without gluon emission. In Part I, we computed
these two contributions as if the strong interactions were an Abelian gauge
theory. To obtain the corresponding expressions in QCD, we need only multiply
the Abelian formulae by the group theory factor

$$ \mathrm{tr}[t^a t^a] = C_2(r) \cdot \mathrm{tr}[1] = \frac{4}{3} \cdot 3, \tag{17.6} $$

where we have used Eq. (15.97) to evaluate $C_2(r)$ for the fundamental
representation of $SU(3)$. The factor of 3 is the same color sum that appears
in Eq. (17.4). Thus we can obtain the correct formulae for QCD from those of
the Final Project by making the replacement

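$$ g^2 \to \frac{4}{3}\, g^2, \quad \text{or} \quad \alpha_g \to \frac{4}{3}\,\alpha_s, \tag{17.7} $$

where

$$ \alpha_s = \frac{g^2}{4\pi} \tag{17.8} $$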

is the strong-interaction analogue of the fine-structure constant. 

The end result of the Final Project of Part I was a formula for the total
cross section to produce hadrons in $e^+e^-$ annihilation. If we replace
$\alpha_g$ with $(4/3)\alpha_s$, that result becomes

$$ \sigma(e^+e^- \to \text{hadrons}) = \sigma_0 \cdot \Bigl(3\sum_f Q_f^2\Bigr) \cdot \Bigl[1 + \frac{\alpha_s}{\pi} + \mathcal{O}(\alpha_s^2)\Bigr]. \tag{17.9} $$

This is actually the sum of the rates for two elementary processes,
$e^+e^- \to q\bar q$ (including the correction from the third diagram of
Fig. 17.2(b)) and $e^+e^- \to q\bar q g$. Although the rate for each of these
processes is divergent as the gluon mass is taken to zero, that divergence
cancels when they are combined. This is another example of the phenomenon of
infrared divergence cancellations that we studied for the example of electron
scattering in Sections 6.4 and 6.5. There we showed that the dressing of the
final state by the emission of soft and collinear photons does not affect the
overall scattering rate. Here, we see again that infrared divergences cancel
in the total rate, although the sum over real and virtual gluon corrections
leaves behind a simple numerical correction.

It is not difficult to understand the cancellation of infrared logarithms
intuitively. The original process $e^+e^- \to q\bar q$ is extremely rapid:
Since the virtual photon is off-shell by an amount $q^2 = s$, the quarks are
created in a time $1/\sqrt{s}$. However, the emission of collinear gluons, and
the virtual corrections associated with the exchange of soft gluons, occur
over a much longer time scale. In the diagrams with gluon emission, the
virtual quark or antiquark is off-mass-shell by an amount $p_{\perp g}$, where
$p_{\perp g}$ is the transverse momentum of the gluon relative to the
$q\bar q$ system. Thus this virtual state survives for a time $1/p_{\perp g}$
before it decays. Such a slow process cannot affect the probability that a
$q\bar q$ pair was produced; it can only affect the properties of the final
state into which the $q\bar q$ system will evolve. By this logic, the only
perturbative corrections that can affect the total cross section are those for
which $p_{\perp g} \sim \sqrt{s}$. Another way to express this conclusion would
be to argue that, after contributions from the infrared-sensitive regions have
canceled, the contributions that remain come from the region of large real or
virtual gluon momenta. By either argument, formula (17.9) should be a
meaningful prediction of QCD perturbation theory, even though it involves an
integral over the region of soft gluon emission.


The Running of $\alpha_s$

Formula (17.9) depends on the QCD coupling constant $\alpha_s$, which must be
defined at some renormalization point $M$. This is in contrast to the QED
coupling constant, which we defined in a natural way by on-shell
renormalization. In QCD, we would like to avoid discussing on-shell quarks,
since these are strongly interacting particles that are significantly affected
by nonperturbative forces. The use of an arbitrary renormalization point $M$
allows us to avoid this problem. We will define $\alpha_s$ by renormalization
conditions imposed at a large momentum scale $M$ where the coupling is small;
this value of $\alpha_s$ can then be used to predict the results of scattering
processes with any large momentum transfer.

However, the use of renormalization at a scale $M$ in a computation involving
momentum invariants of order $P^2$ involves some subtlety when $P^2$ and $M^2$
are very different. In our discussion of Section 12.3, we saw that, in this
circumstance, Feynman diagrams with $n$ loops typically contain correction
terms proportional to $(\alpha_s \log(P^2/M^2))^n$. Fortunately, we can absorb
these corrections into the lowest-order terms by using the renormalization
group to replace the fixed renormalized coupling with a running coupling
constant.

To illustrate how this analysis applies to QCD, let us examine the
implications of the Callan-Symanzik equation for the $e^+e^-$ annihilation
total cross section $\sigma$, viewed as a function of $s$, a renormalization
scale $M$, and the value of $\alpha_s$ at the renormalization scale. Like the
QED potential (12.87), the $e^+e^-$ total cross section is an observable
quantity and so its normalization is independent of any conventions. It
therefore obeys a Callan-Symanzik equation with $\gamma = 0$:

$$ \Bigl[ M\frac{\partial}{\partial M} + \beta(g)\frac{\partial}{\partial g} \Bigr] \sigma(s, M, \alpha_s) = 0. \tag{17.10} $$

By dimensional analysis, we can write

$$ \sigma = \frac{c}{s}\, f\Bigl(\frac{s}{M^2}, \alpha_s\Bigr), \tag{17.11} $$

where $c$ is a dimensionless constant. Then the Callan-Symanzik equation
implies that $f$ depends on its arguments only through the running coupling
constant $\alpha_s(Q) = \bar g^2/4\pi$, evaluated at $Q^2 = s$. The coupling
constant $\bar g$ is defined to satisfy the renormalization group equation

$$ \frac{d}{d\log(Q/M)}\, \bar g = \beta(\bar g), \tag{17.12} $$


with initial condition $\alpha_s(M) = \alpha_s$. For QCD with three colors and
$n_f$ approximately massless quarks, the $\beta$ function is given by
Eq. (16.135):

$$ \beta(g) = -\frac{b_0\, g^3}{(4\pi)^2}, \quad \text{with} \quad b_0 = 11 - \frac{2}{3}\, n_f. \tag{17.13} $$

Then the solution of the renormalization group equation is

$$ \alpha_s(Q) = \frac{\alpha_s}{1 + (b_0 \alpha_s/2\pi) \log(Q/M)}. \tag{17.14} $$

The explicit dependence of $\sigma$ on $\alpha_s$ can be found by matching the
successive terms in the expansion of $f(\alpha_s(\sqrt{s}))$ to the terms in
the perturbative expansion. To the order of the first corrections, we find
simply

$$ \sigma = \sigma_0 \cdot \Bigl(3\sum_f Q_f^2\Bigr) \cdot \Bigl[1 + \frac{\alpha_s(\sqrt{s})}{\pi} + \mathcal{O}\bigl(\alpha_s^2(\sqrt{s})\bigr)\Bigr]. \tag{17.15} $$

Thus the Callan-Symanzik equation instructs us to replace the fixed
renormalized coupling $\alpha_s$ with the running coupling constant
$\alpha_s(Q)$, evaluated at $Q^2 = s$.
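The replacement of the fixed coupling by the running coupling can be made concrete with a few lines of code. The following is a minimal sketch of Eqs. (17.13) to (17.15) at one-loop order; the function names and the numerical inputs (a reference scale of 10 GeV, a coupling of 0.2 there, five flavors) are hypothetical and chosen only for illustration.

```python
import math


def b0(n_f):
    """One-loop coefficient b_0 = 11 - (2/3) n_f of Eq. (17.13)."""
    return 11.0 - 2.0 * n_f / 3.0


def alpha_s_running(q, m, alpha_s_m, n_f):
    """Eq. (17.14): coupling at scale q, given its value alpha_s_m at scale m."""
    return alpha_s_m / (1.0 + b0(n_f) * alpha_s_m / (2.0 * math.pi) * math.log(q / m))


def r_improved(sqrt_s, m, alpha_s_m, n_f, sum_q2):
    """sigma / sigma_0 from Eq. (17.15), with the O(alpha_s^2) terms dropped."""
    a = alpha_s_running(sqrt_s, m, alpha_s_m, n_f)
    return 3.0 * sum_q2 * (1.0 + a / math.pi)


# Illustrative inputs only; none of these numbers is taken from the text.
M_REF, ALPHA_AT_M, N_F = 10.0, 0.2, 5
for q in (10.0, 30.0, 100.0):
    print(q, alpha_s_running(q, M_REF, ALPHA_AT_M, N_F))
```

Because only the ratio `q / m` enters, any consistent energy unit can be used. The coupling returned at `q == m` reproduces the input value, which is the initial condition stated below Eq. (17.12).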

Because the fixed coupling $\alpha_s$ depends on the arbitrary renormalization
point $M$, it is sometimes useful to remove it from our formulae completely.
To do this, we define a mass scale conventionally called $\Lambda$ (not to be
confused with an ultraviolet cutoff!) satisfying

$$ 1 = g^2 (b_0/8\pi^2) \log(M/\Lambda). \tag{17.16} $$

Then Eq. (17.14) can be rearranged into the form

$$ \alpha_s(Q) = \frac{2\pi}{b_0 \log(Q/\Lambda)}. \tag{17.17} $$

This formula is the clearest expression of the statement that $\alpha_s(Q)$
becomes small as $(\log(Q))^{-1}$ for large $Q$. The momentum scale $\Lambda$
is the scale at which $\alpha_s$ becomes strong as $Q^2$ is decreased.

Experimental measurements of the rate of this reaction and others yield
a value of $\Lambda \approx 200$ MeV. QCD perturbation theory is valid only
when $Q$ is somewhat larger than this, say above $Q = 1$ GeV, where
$\alpha_s(Q) \approx 0.4$. The strong interactions become strong at distances
larger than $\sim 1/\Lambda$, which is roughly the size of the light hadrons.
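The numbers quoted here can be checked directly from Eq. (17.17). The sketch below assumes, purely for illustration, a fixed $n_f = 3$ at all scales; in practice the number of effectively massless flavors changes with $Q$, so the values at high $Q$ are indicative only.

```python
import math

LAMBDA_GEV = 0.2  # Lambda of about 200 MeV, as quoted in the text


def alpha_s(q_gev, n_f=3, lam=LAMBDA_GEV):
    """One-loop running coupling in the Lambda form of Eq. (17.17)."""
    b0 = 11.0 - 2.0 * n_f / 3.0
    return 2.0 * math.pi / (b0 * math.log(q_gev / lam))


for q in (1.0, 10.0, 100.0):
    print(f"Q = {q:6.1f} GeV   alpha_s = {alpha_s(q):.3f}")
```

With these inputs the value at $Q = 1$ GeV comes out close to the 0.4 quoted above, and the coupling falls only logarithmically as $Q$ increases.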

Although the example of the $e^+e^-$ annihilation cross section is especially
simple, since it depends on only one momentum invariant, similar conclusions
carry over to other QCD predictions. In analyzing strong-interaction processes
that are sensitive to the quark and gluon substructure, we will find
leading-order formulae for the reaction cross sections that depend on the
renormalized coupling $\alpha_s$. To make these expressions satisfy the
Callan-Symanzik equation, we must replace this fixed coupling with the running
coupling constant $\alpha_s(Q)$, evaluated with $Q$ of the order of the
momentum invariants of the reaction. Since the running coupling constant
depends only logarithmically on $Q$, we need not worry about choosing $Q$
precisely. If we guess the proper scale incorrectly by a factor of 2, this
induces an error in $\alpha_s(Q)$ that is of order
$(\log(Q))^{-2} \sim \alpha_s^2(Q)$. Conversely, this ambiguity would be
resolved by computing to the next order in $\alpha_s$.
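The size of this scale ambiguity is easy to estimate numerically. From Eq. (17.17), $1/\alpha_s(2Q) = 1/\alpha_s(Q) + (b_0/2\pi)\log 2$, so the shift in the coupling is approximately $(b_0/2\pi)\,\alpha_s^2(Q)\log 2$. The sketch below compares the exact one-loop shift with this estimate; the choices $n_f = 5$, $\Lambda = 0.2$ GeV, and $Q = 30$ GeV are illustrative assumptions.

```python
import math


def alpha_s(q_gev, n_f=5, lam=0.2):
    """One-loop running coupling, Eq. (17.17)."""
    b0 = 11.0 - 2.0 * n_f / 3.0
    return 2.0 * math.pi / (b0 * math.log(q_gev / lam))


def scale_shift(q_gev, factor=2.0, n_f=5):
    """Return (exact, estimate) for alpha_s(Q) - alpha_s(factor * Q)."""
    b0 = 11.0 - 2.0 * n_f / 3.0
    a = alpha_s(q_gev, n_f)
    exact = a - alpha_s(factor * q_gev, n_f)
    estimate = b0 / (2.0 * math.pi) * a * a * math.log(factor)
    return exact, estimate


exact, estimate = scale_shift(30.0)
print(f"exact shift = {exact:.4f}, O(alpha_s^2) estimate = {estimate:.4f}")
```

The two numbers differ only at relative order $\alpha_s$, which is the sense in which the ambiguity is a next-order effect.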
Before concluding this formal treatment of the $e^+e^-$ annihilation cross
section, we should add one qualification. At the beginning of Section 12.2,
we remarked that renormalization group predictions can be complicated by
the appearance of physical thresholds and their associated singularities, and
so we stated these predictions only for the case in which the relevant
momentum invariant $P^2$ was large and spacelike. In the present chapter, we
will be concerned with cross sections for quark and gluon reactions, evaluated
on-shell. This introduces additional complications of principle. For example,
in order to apply the Callan-Symanzik equation to $\sigma(s)$, we needed to
know that this quantity contains no infrared divergences whose regulator might
provide another dimensionful parameter. Throughout this chapter we will assume
that similar cancellations of divergences associated with soft and collinear
gluons occur in the processes of interest to us. The complete proof of these
cancellations in QCD can be carried through, but it is rather technical.* In
some cases, an alternative point of view is possible; one can justify the use
of the renormalization group to analyze an on-shell amplitude by relating it
to Green's functions evaluated in the spacelike region. This method of
analysis, which brings its own insights, will be the main subject of
Chapter 18.


Gluon Emission and Jet Production 


A second result of the Final Project of Part I was a formula for the
differential cross section for $q\bar q$ production with gluon emission.
Transcribing this formula to QCD using (17.7) gives the following result: Let
$x_1$, $x_2$, $x_3$ be the ratios of the quark, antiquark, and gluon energies
to the electron beam energy. These satisfy $0 < x_i < 1$ and
$x_1 + x_2 + x_3 = 2$. Then the cross section for $e^+e^- \to q\bar q g$ is
given by

$$ \frac{d\sigma}{dx_1\, dx_2}(e^+e^- \to q\bar q g) = \sigma_0 \cdot \Bigl(3\sum_f Q_f^2\Bigr) \cdot \frac{2\alpha_s}{3\pi}\, \frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)}. \tag{17.18} $$


This cross section is singular as $x_1$ or $x_2$ approaches 1. The limit
$x_1 \to 1$ corresponds to configurations in which the quark has the maximum
possible energy, while the antiquark and the gluon go off in the opposite
direction, sharing the remaining energy. Then the antiquark and gluon have
almost collinear lightlike momentum vectors and so form a system of very small
invariant mass. Similarly, the limit $x_2 \to 1$ corresponds to configurations
in which the quark and gluon are collinear. These singularities are
responsible for the divergence of the integrated cross section in the limit of
vanishing gluon mass.
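The growth of the cross section near the collinear region can be seen by evaluating Eq. (17.18) at a few points. The sketch below returns the differential cross section divided by the leading-order rate $\sigma_0 \cdot 3\sum_f Q_f^2$, so that only the factor depending on $\alpha_s$, $x_1$, and $x_2$ remains. The value $\alpha_s = 0.15$ and the sample points are illustrative assumptions.

```python
import math


def three_jet_density(x1, x2, alpha_s):
    """Eq. (17.18) divided by sigma_0 * 3 * sum_f Q_f^2."""
    x3 = 2.0 - x1 - x2  # gluon energy fraction, fixed by x1 + x2 + x3 = 2
    if not all(0.0 < x < 1.0 for x in (x1, x2, x3)):
        raise ValueError("energy fractions must satisfy 0 < x_i < 1")
    prefactor = 2.0 * alpha_s / (3.0 * math.pi)
    return prefactor * (x1 ** 2 + x2 ** 2) / ((1.0 - x1) * (1.0 - x2))


ALPHA_S = 0.15  # illustrative value, not taken from the text
for x1 in (0.7, 0.9, 0.99, 0.999):
    print(x1, three_jet_density(x1, 0.6, ALPHA_S))
```

At fixed $x_2$ the result grows like $1/(1 - x_1)$ as $x_1 \to 1$, which is the singular behavior described above.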

How should we interpret these singularities? In our general treatment
of bremsstrahlung in Section 6.1, we saw that the emission of a photon by
a scattered electron is enhanced, for collinear radiation, by a factor of
order $\log(q^2/m^2)$, where $m$ is the mass of the electron. Thus the total
rate for emitting a collinear photon formally diverges in the limit of zero
mass. The same conclusion holds for the emission of gluons by quarks. A
divergence that appears for collinear emissions in the limit of zero mass is
called a mass singularity. In QED, we saw that the mass singularity signals a
real physical effect of strong collinear radiation when $q^2$ is large. In
QCD, we might expect strong gluon radiation in this limit, but we must think
carefully about how this radiation appears experimentally. Whether a collinear
gluon is radiated or not, the quark and antiquark emerging from the reaction
will undergo further soft interactions with the other products. These
processes must continue, producing quark-antiquark pairs and emitting and
absorbing gluons, until all colored particles are collected into color-singlet
hadrons. Thus the presence of one or more collinear gluons will have no
noticeable effect on the final state, which consists of two back-to-back jets
of hadrons. For this reason, formula (17.18) is of no use when the gluon
transverse momentum is less than the typical scale of soft gluon interactions,
roughly 1 GeV.

*For a review of the theorems justifying the formulae of perturbative QCD, see
J. C. Collins and D. E. Soper, Ann. Rev. Nucl. Sci. 37, 383 (1987).

When the gluon is emitted with substantial transverse momentum with
respect to the $q\bar q$ axis, however, it is not possible for subsequent soft
exchanges to recall or reverse this transverse momentum. In this case, the
$q\bar q g$ system evolves into a system of three distinct jets of hadrons.
Thus, sufficiently far from the collinear regions, we can interpret
Eq. (17.18) as the cross section for producing events with three hadronic jets
having energies $x_1$, $x_2$, $x_3$ times the electron beam energy.

By an analysis similar to that given above for the total cross section, we
can improve Eq. (17.18) by replacing the fixed coupling constant $\alpha_s$
with a running $\alpha_s(Q)$. A reasonable choice for $Q$ is the transverse
momentum of the gluon, $p_{\perp g}$, described below Eq. (17.9). If this
transverse momentum is too small, however, $\alpha_s(Q)$ will be large, and
our leading-order formula will break down. This is a second reason why we
cannot use formula (17.18) when the transverse momentum of the gluon is less
than about 1 GeV.

On the other hand, when the gluon transverse momentum is much larger
than 1 GeV, there is no reason to distrust QCD perturbation theory. Soft
processes cannot disturb the three-jet nature of the hadronic state, and
asymptotic freedom ensures that the coupling constant is small, so that the
leading order of perturbation theory will be a good approximation.

The three-jet cross section (17.18) is a good example of the type of
prediction that one obtains from the use of perturbation theory in QCD. We
describe a strong-interaction process by the invariant momentum transfer $Q$
it gives to hadronic constituents. QCD perturbation theory makes predictions
about the flow of energy and momentum in such a reaction into the final
system of jets of hadrons. If $Q$ is small, perturbation theory is invalid,
and we obtain no useful prediction. However, if $Q$ is large, the asymptotic
freedom of QCD implies that Feynman diagrams for quarks and gluons will
correctly predict the behavior of the final system of hadronic jets.

Figure 17.3. Deep inelastic scattering in QCD. The diagram shows the flow
of momentum when a high-energy electron scatters from a quark taken from
the wavefunction of the proton.


17.3 Deep Inelastic Scattering

After $e^+e^-$ annihilation into hadrons, the next simplest reaction involving
strongly interacting particles is electron scattering from a proton, or from
some other hadron. At the most elementary level, this reaction can be viewed
as the electromagnetic scattering of an electron from a quark inside the
proton.† A way to visualize the process is shown in Fig. 17.3. Call the proton
momentum $P$, and the initial quark momentum $p$. Call the initial and final
momenta of the electron $k$ and $k'$. If the final electron momentum is
measured, one can deduce the momentum $q = k - k'$ transferred by the virtual
photon to the hadronic system. The vector $q$ is spacelike, and one
conventionally writes $q^2 = -Q^2$.

If the invariant momentum transfer $Q^2$ is large, the quark is ejected from
the proton in a manner that cannot be balanced by subsequent soft processes.
These soft processes will, however, create gluons and quark-antiquark pairs
that eventually neutralize the color and cause the struck quark to materialize
as a jet of hadrons in the direction of the momentum transfer from the
electron. Typically the total invariant mass of the final hadronic system is
large, since the struck quark carries a large momentum with respect to the
other "spectator" quarks. In this case, the process is referred to as deep
inelastic scattering.

To derive a first approximation to the cross section for electron-proton
scattering, we consider this reaction from a frame in which the electron and
proton are moving rapidly toward each other, for example, the electron-proton
center-of-mass frame. We assume that the center-of-mass energy is large enough
that we can ignore the proton mass in working out the kinematics. Then the
proton has an almost lightlike momentum along the collision axis. The
constituents of the proton also have lightlike momenta, which are almost
collinear with the momentum of the proton. This is because a constituent
cannot acquire a large transverse momentum except through exchange of a hard
gluon, a process that is suppressed by the smallness of $\alpha_s$ at large
momentum scales. Thus, to leading order in QCD perturbation theory, we can
write

$$ p = \xi P, \tag{17.19} $$

where $\xi$ is a number between 0 and 1, called the longitudinal fraction of
the constituent. To leading order in $\alpha_s$ we can also ignore gluon
emission or exchanges during the collision process. The cross section for
electron-proton scattering is then given by the cross section for
electron-quark scattering at given longitudinal fraction $\xi$, multiplied by
the probability that the proton contains a quark at that value of $\xi$,
integrated over $\xi$.

†The electron could just as well be a muon; all the same formulae apply in
this case. Leptons can also scatter from quarks via the neutral-current weak
interaction, as we will see in Chapter 20.

But the probability that the proton contains a certain constituent with
a certain momentum fraction cannot be computed using QCD perturbation
theory, since it depends on the soft processes that determine the structure of
the proton as a bound state of quarks and gluons. We will therefore consider
this probability to be an unknown function, to be determined from experiment.
Eventually, we will need to make use of such a probability function for
each species of quark, antiquark, and gluon that can be found in the
wavefunction of the proton. Collectively, these constituents are called
partons. For each parton species $f$, we write the probability of finding a
constituent of the proton of type $f$ at longitudinal fraction $\xi$ as

$$ \Bigl(\text{probability of finding constituent } f \text{ with longitudinal fraction } \xi\Bigr) = f_f(\xi)\, d\xi. \tag{17.20} $$

The functions $f_f(\xi)$ are called the parton distribution functions. Using
this notation, the cross section for electron-proton inelastic scattering is
given to leading order in $\alpha_s$ by the expression

$$ \sigma\bigl(e^-(k)\, p(P) \to e^-(k') + X\bigr) = \int_0^1 d\xi \sum_f f_f(\xi)\, \sigma\bigl(e^-(k)\, q_f(\xi P) \to e^-(k') + q_f(p')\bigr), \tag{17.21} $$


where X stands for any hadronic final state. The sum in (17.21) contains 
contributions from constituent antiquarks as well as constituent quarks. 

Equation (17.21) is equivalent to the formula (14.8) that we constructed
for this reaction in Chapter 14. Now we see that this formula is justified by
the smallness of the QCD coupling constant at large momentum scales. It
is important to remember, however, that (17.21) is not the complete
prediction of QCD, but only the first term of an expansion in $\alpha_s$; this
level of approximation is called the parton model. The higher-order QCD
corrections to Eq. (17.21) will involve modifications both to the
electron-quark scattering cross section and to the parton distribution
functions. The most important of these corrections are discussed in
Section 17.5.

In the same way, all other reactions of the proton that involve large
momentum transfer also have parton model descriptions. In QCD, all of these
reaction cross sections are computed from scattering amplitudes for quarks
and gluons. The initial motion of the partons for any process is described
by the same parton distribution functions $f_f(\xi)$ that appear in deep
inelastic scattering.

Let us now work out the explicit leading-order formula for the deep inelastic
scattering cross section, reviewing the analysis in Chapter 14. In Eq. (14.3),
we wrote the leading-order differential cross section for the parton-level
process,

$$ \frac{d\sigma}{d\hat t}(e^- q \to e^- q) = \frac{2\pi\alpha^2 Q_f^2}{\hat s^2} \Bigl[\frac{\hat s^2 + \hat u^2}{\hat t^2}\Bigr]. \tag{17.22} $$


In general, we will use the symbols $\hat s$, $\hat t$, $\hat u$ to denote the
Mandelstam variables for two-body scattering processes at the parton level.
These variables must be related to observable properties of the hadronic
system or the scattered electron. For massless initial and final particles,

$$ \hat s + \hat t + \hat u = 0. $$

In the case of deep inelastic scattering,

$$ \hat t = -Q^2 $$

and

$$ \hat s = 2p \cdot k = 2\xi P \cdot k = \xi s. $$

Thus the cross section for deep inelastic scattering at fixed $Q^2$ is given by

$$ \frac{d\sigma}{dQ^2} = \int_0^1 d\xi \sum_f f_f(\xi)\, Q_f^2\, \frac{2\pi\alpha^2}{Q^4} \Bigl[1 + \Bigl(1 - \frac{Q^2}{\xi s}\Bigr)^2\Bigr]\, \theta(\xi s - Q^2). \tag{17.23} $$

The final factor expresses the kinematic constraint $\hat s \geq |\hat t|$.
Expression (17.23) should be an accurate first approximation to the deep
inelastic scattering cross section when $Q^2$ is large. In that case, the
corrections to this formula from hard gluon emissions and exchanges will be of
order $\alpha_s(Q^2)$.

We also showed in Chapter 14 that the measurement of the scattered
electron momentum $k'$ and thus the momentum transfer $q$ uniquely determines
an allowed value of $\xi$ for elastic electron-quark scattering. This value is
given by Eq. (14.7):

$$ \xi = x, \quad \text{where} \quad x \equiv \frac{Q^2}{2P \cdot q}. \tag{17.24} $$


When (17.23) is expressed as a doubly differential cross section in $x$ and
$Q^2$, it becomes the simple product of a parton-level cross section and a sum
of parton distribution functions evaluated at $\xi = x$. In the literature,
the symbol $x$ is often used interchangeably with $\xi$, and we will follow
this practice from here on.

It is especially convenient to represent the cross section in terms of
dimensionless combinations of kinematic variables. One of these should be $x$;
a good choice for the other is

$$ y \equiv \frac{2P \cdot q}{2P \cdot k} = \frac{2P \cdot q}{s}. \tag{17.25} $$

In the frame in which the proton is at rest,

$$ y = \frac{q^0}{k^0}; \tag{17.26} $$

that is, $y$ is the fraction of the incident electron's energy that is
transferred to the hadronic system. On the other hand, since $p = \xi P$, we
can evaluate $y$ in terms of parton variables:

$$ y = \frac{2p \cdot (k - k')}{2p \cdot k} = \frac{\hat s + \hat u}{\hat s}, \tag{17.27} $$

so that

$$ \frac{\hat u}{\hat s} = -(1 - y). \tag{17.28} $$

From Eq. (17.26) or (17.28), we see that $y \leq 1$. The kinematically allowed
region in the $(x, y)$ plane is thus

$$ 0 \leq x \leq 1, \qquad 0 \leq y \leq 1. $$


To express Eq. (17.23) in terms of $x$ and $y$, we need the formula

$$ Q^2 = xys, \tag{17.29} $$

which follows from Eqs. (17.24) and (17.25), and the change of variables

$$ d\xi\, dQ^2 = dx\, dQ^2 = \frac{dQ^2}{dy}\, dx\, dy = xs\, dx\, dy. $$

Then the differential cross section becomes

$$ \frac{d^2\sigma}{dx\, dy}(e^- p \to e^- X) = \Bigl(\sum_f x f_f(x)\, Q_f^2\Bigr) \frac{2\pi\alpha^2 s}{Q^4} \bigl[1 + (1 - y)^2\bigr]. \tag{17.30} $$


The factor $1/Q^4$ comes from the square of the virtual photon propagator.
Once this factor is removed, the dependence on $x$ and $y$ completely
factorizes. Each half of this relation contains physical information. The fact
that the parton distributions $f_f(x)$ depend only on $x$ and are independent
of $Q^2$ is the statement of Bjorken scaling. This tells us that the initial
distribution of partons is independent of the details of the hard scattering.
The $y$ dependence of the cross section comes from the underlying parton cross
section. In Chapter 5, we saw that the elementary QED cross sections, viewed
in the high-energy limit, reflect the helicities of the interacting particles.
The behavior $[1 + (1 - y)^2]$ in (17.30) is known as the Callan-Gross
relation; it is specific to the scattering of electrons from massless
fermions. This relation gave evidence that the partons involved in deep
inelastic scattering were fermions, at a time when the relation between
partons and quarks was still unclear.
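The kinematic variables of this section can be computed directly from four-vectors. The sketch below evaluates $Q^2$, $x$, $y$, and $s$ from Eqs. (17.24) and (17.25), treating the proton and electron as massless as in the text, and confirms that $Q^2 = xys$. The beam energies, the scattered electron energy, and the scattering angle are hypothetical values chosen only for illustration.

```python
import math


def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def dis_variables(P, k, k_prime):
    """Return (Q2, x, y, s) using q = k - k', Eqs. (17.24) and (17.25)."""
    q = tuple(a - b for a, b in zip(k, k_prime))
    Q2 = -dot(q, q)
    s = 2.0 * dot(P, k)          # massless approximation, s = 2 P.k
    x = Q2 / (2.0 * dot(P, q))
    y = 2.0 * dot(P, q) / s
    return Q2, x, y, s


# Hypothetical head-on collision: proton along +z, electron along -z (GeV).
E_P, E_E, E_OUT, THETA = 100.0, 10.0, 8.0, 0.5
P = (E_P, 0.0, 0.0, E_P)
k = (E_E, 0.0, 0.0, -E_E)
k_prime = (E_OUT, E_OUT * math.sin(THETA), 0.0, -E_OUT * math.cos(THETA))

Q2, x, y, s = dis_variables(P, k, k_prime)
print(f"Q2 = {Q2:.3f}  x = {x:.5f}  y = {y:.5f}  s = {s:.1f}")
print("helicity factor 1 + (1 - y)^2 =", 1.0 + (1.0 - y) ** 2)
```

Working with invariants rather than frame-specific formulas means the same function applies in the center-of-mass frame or the proton rest frame without modification.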


Deep Inelastic Neutrino Scattering 


Because the sum over quark flavors factorizes in Eq. (17.30), one cannot
determine the individual parton distribution functions $f_f(x)$ from electron
scattering experiments alone. One can, however, obtain more detailed
information on the structure of the proton through deep inelastic neutrino
scattering.

Neutrinos have zero electric charge and so do not interact by photon
exchange, but they do interact with quarks through weak interactions. We will
discuss the weak interactions in detail in Chapter 20; for the moment, let
us adopt a simplified description that concentrates on the elementary process
shown in Fig. 17.4. In this process, a neutrino converts to the associated
charged lepton,‡ exchanging a virtual massive vector boson, the $W^+$. This
boson couples to a quark current that converts a $d$ quark to a $u$ quark. The
effect of this exchange process is to provide a different, but completely
characterized, method for injecting a hard momentum transfer $q$. The
amplitude for this process is described by the effective Lagrangian

$$ \Delta\mathcal{L} = \frac{g^2}{2}\, \frac{1}{m_W^2}\; \bar\ell \gamma^\mu \Bigl(\frac{1 - \gamma^5}{2}\Bigr) \nu\; \bar u \gamma_\mu \Bigl(\frac{1 - \gamma^5}{2}\Bigr) d + \text{h.c.}, \tag{17.31} $$

where $\ell$, $\nu$, $d$, $u$ are the fermion fields associated with the
charged lepton, the neutrino, and the $d$ and $u$ quarks, and $g$ is the weak
interaction coupling constant. The factor $1/m_W^2$ comes from the $W$ boson
propagator, considered in the limit $q^2 \ll m_W^2$. The first two factors are
often written in terms of the Fermi constant $G_F$, defined as

$$ \frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2}. \tag{17.32} $$

This constant gives the strength of the weak interactions at energies much
less than $m_W$. The crucial property of the weak interactions, shown
explicitly in (17.31), is that the $W$ boson couples only to the left-handed
helicity states of relativistic fermions. The deeper significance of this
property will be discussed in Chapter 20.

For technical reasons, it is easiest to do neutrino deep inelastic scattering 
using muon neutrinos, which convert to muons after emitting a W boson. It is 
equally feasible to scatter muon antineutrinos from nuclear targets, and, as we 
will see, it is interesting to compare the effects of neutrinos and antineutrinos. 
Since the proton contains a small admixture of the heavier quarks s, c, these 
also give small contributions to neutrino deep inelastic scattering. However, 
we will ignore these contributions in our discussion. 


‡There is also a neutral-current weak interaction in which the neutrino
remains a neutrino; see Problem 20.4.

Figure 17.4. The elementary neutrino scattering process mediated by the
weak interaction.


The cross section for neutrino deep inelastic scattering is given by a
formula analogous to (17.21), with the photon-exchange cross section replaced
by one resulting from $W$ exchange. It is straightforward to work out this
cross section directly. However, we can also obtain the result from
Eq. (17.22), if we look back to Chapter 5 and recall how the structure of this
equation arises from the various helicity contributions. In (17.22), the
factor $\hat t^2$ in the denominator came from the photon propagator; this
factor is replaced by $m_W^4$ in the weak interaction case. The factor
$[\hat s^2 + \hat u^2]$ came from the Dirac matrix algebra. We saw in
Section 5.2 that the first term is the contribution of left-handed electrons
scattering from left-handed fermions or right-handed electrons scattering from
right-handed fermions, and that the second term arises from the other helicity
combinations. For the case of neutrino-quark scattering, the interaction
(17.31) allows only the scattering of left-handed neutrinos from left-handed
quarks, so only the $\hat s^2$ term appears. To determine the overall
normalization of the cross section, we note that, since the neutrinos are
produced in weak interactions, they always have left-handed polarization, so
no polarization average should be done. On the other hand, we must still
average over the polarization of the initial quark. In all, we find

$$ \frac{d\sigma}{d\hat t}(\nu d \to \mu^- u) = \frac{\pi g^4}{2 (4\pi)^2 m_W^4} = \frac{G_F^2}{\pi}. \tag{17.33} $$


It is easy to check this formula by explicit computation starting from (17.31). 

The cross section for the scattering of antineutrinos from quarks can be
worked out in the same way. Note that this reaction involves the exchange
of a $W^-$, and so converts $u$ quarks to $d$ quarks. However, the $u$ quarks
must still be left-handed. The only modification from the previous paragraph
comes in the fact that the antineutrinos that couple to the interaction
(17.31) are right-handed, so the cross section comes from the term in (17.22)
proportional to $\hat u^2$:

$$ \frac{d\sigma}{d\hat t}(\bar\nu u \to \mu^+ d) = \frac{\pi g^4}{2 (4\pi)^2 m_W^4}\, (1 - y)^2 = \frac{G_F^2}{\pi}\, (1 - y)^2. \tag{17.34} $$

Again, it is easy to verify this formula directly. The cross section for
neutrino-antiquark scattering, converting a $\bar u$ into a $\bar d$, is also
given by Eq. (17.34), while the cross section for antineutrino-antiquark
scattering, converting a $\bar d$ into a $\bar u$, is given by Eq. (17.33).

Figure 17.5. The distribution in $y$ of neutrino and antineutrino deep
inelastic scattering from an iron target, as measured by the CDHS experiment,
J. G. H. de Groot, et al., Z. Phys. C1, 143 (1979). The solid curves are fits
to the form $A + B(1 - y)^2$.

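To convert these parton-level cross sections to physical cross sections,
we combine them with the parton distribution functions. The kinematics is
exactly the same as in the case of electron scattering. Thus, following the
arguments that led to Eq. (17.30), we obtain the expressions

$$ \begin{aligned} \frac{d^2\sigma}{dx\, dy}(\nu p \to \mu^- X) &= \frac{G_F^2 s}{\pi} \bigl[x f_d(x) + x f_{\bar u}(x) \cdot (1 - y)^2\bigr], \\ \frac{d^2\sigma}{dx\, dy}(\bar\nu p \to \mu^+ X) &= \frac{G_F^2 s}{\pi} \bigl[x f_u(x) \cdot (1 - y)^2 + x f_{\bar d}(x)\bigr]. \end{aligned} \tag{17.35} $$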


According to these relations, deep inelastic neutrino scattering allows one 
to map separately the parton distribution functions for u and d quarks and 
antiquarks in the proton. 

In addition, Eq. (17.35) makes a dramatic qualitative prediction: To the
extent that a proton (or neutron) is made of quarks with very few additional
quark-antiquark pairs, the deep inelastic neutrino scattering cross section
should be constant in $y$, while the antineutrino scattering cross section
should fall off as $(1 - y)^2$. The measured $y$ dependence of these deep
inelastic cross sections is shown in Fig. 17.5. The qualitative behavior
predicted by the parton description is clearly evident; the discrepancy from
the strict prediction can be accounted for by a small antiquark component in
the nucleon wave function.
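The two $y$ shapes in Eq. (17.35) can be compared by integrating them over the allowed range $0 \leq y \leq 1$. The sketch below uses the fit form $A + B(1 - y)^2$ quoted in the caption of Fig. 17.5; the function names and the sample coefficients are hypothetical. A flat distribution integrates to 1 and a pure $(1 - y)^2$ distribution to $1/3$, so with equal quark weights and no antiquarks the antineutrino rate would be one third of the neutrino rate.

```python
def y_shape(y, flat, falling):
    """Fit form A + B (1 - y)^2 with A = flat and B = falling."""
    return flat + falling * (1.0 - y) ** 2


def integrate_y(flat, falling, n=1000):
    """Midpoint-rule integral of the shape over 0 <= y <= 1."""
    h = 1.0 / n
    return h * sum(y_shape((i + 0.5) * h, flat, falling) for i in range(n))


neutrino_like = integrate_y(1.0, 0.0)      # quarks only: constant in y
antineutrino_like = integrate_y(0.0, 1.0)  # quarks only: (1 - y)^2
print(neutrino_like, antineutrino_like, antineutrino_like / neutrino_like)

# A small, purely illustrative antiquark admixture adds the other shape.
print(integrate_y(0.9, 0.1), integrate_y(0.1, 0.9))
```

The admixture example shows how an antiquark component flattens the antineutrino distribution and tilts the neutrino one, which is the kind of deviation the text attributes to antiquarks in the nucleon.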
The Parton Distribution Functions

Given that the parton model predictions for electron and neutrino deep
inelastic scattering do fit the data, one can make use of these relations to
extract the parton distribution functions and so learn something about the
structure of the proton.* A set of distribution functions, chosen to fit all
available data, is shown in Fig. 17.6. Since all of these distributions,
especially those for antiquarks, peak sharply at small $x$, we have plotted
$x f_f(x)$ for each species. As we remarked in Chapter 14, a small violation
of Bjorken scaling is observed experimentally, so that these distribution
functions change slowly with $Q^2$. The figure shows these functions at
$Q^2 = 4$ GeV$^2$. We will see in Section 17.5 that this violation of Bjorken
scaling is an effect of higher-order corrections in QCD; we will also argue
that the measurement of this scaling violation allows one to determine the
parton distribution function for gluons, $f_g(x)$. Anticipating this result,
we have also plotted this function in the figure. Not surprisingly, one finds
that the $u$ and $d$ quarks are most likely to carry a substantial fraction of
the proton's momentum, while antiquarks and gluons tend to have small
longitudinal fractions.

Since the parton distributions are the probabilities of finding various proton 
constituents, they must be normalized in a way that reflects the quantum 
numbers of the proton. The proton is a bound state of $uud$, plus some 
admixture of quark-antiquark pairs. Thus it should contain an excess of two $u$ 
quarks and one $d$ quark over the corresponding antiquarks. These considerations 
imply the constraints 

$$\int_0^1 dx\,\bigl[f_u(x) - f_{\bar u}(x)\bigr] = 2, \qquad \int_0^1 dx\,\bigl[f_d(x) - f_{\bar d}(x)\bigr] = 1. \tag{17.36}$$

So far we have discussed the parton distributions only for the proton. 
Similar considerations, however, apply to any other hadron. Each hadron has 
its own set of parton distribution functions; these obey sum rules analogous 
to (17.36) but reflecting the particular quantum numbers of the hadron. The 
parton distribution functions should also reflect the symmetries that link 
different hadrons. For example, since the neutron can be generated, to a few 
percent accuracy, by interchanging the role of $u$ and $d$ quarks in the proton, 
its distribution functions obey 

$$f^n_u(x) = f_d(x), \qquad f^n_d(x) = f_u(x), \qquad f^n_{\bar u}(x) = f_{\bar d}(x), \qquad \text{etc.} \tag{17.37}$$

In these equations, and henceforth, a distribution function without a special 
label refers to the proton. The parton distribution functions of the antiproton 
are given by the exact relations 

$$f^{\bar p}_u(x) = f_{\bar u}(x), \qquad f^{\bar p}_{\bar u}(x) = f_u(x), \qquad \text{etc.} \tag{17.38}$$

*A detailed discussion of the extraction of parton distribution functions from data 
can be found in G. Sterman, et al., Rev. Mod. Phys. 67, 157 (1995). 

Figure 17.6. Parton distribution functions $x f_f(x)$ for quarks, antiquarks, 
and gluons in the proton, at $Q^2 = 4$ GeV$^2$. These distributions are obtained 
from a fit to deep inelastic scattering data performed by the CTEQ collaboration 
(CTEQ2L), described in J. Botts, et al., Phys. Lett. B304, 159 (1993). 

In any case, the total amount of momentum carried by the partons must 
sum to the total momentum of the hadron. This implies 

$$\int_0^1 dx\, x\,\bigl[f_u(x) + f_d(x) + f_{\bar u}(x) + f_{\bar d}(x) + f_g(x)\bigr] = 1. \tag{17.39}$$

The distribution functions of quarks and antiquarks in the proton, as extracted 
from deep inelastic scattering data, contribute only about half of the total 
value required for this integral. The remaining energy-momentum must be 
carried by the gluons. 
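
As a rough illustration of how the sum rules (17.36) and (17.39) constrain a parametrization, the following sketch builds toy valence distributions of the assumed form $N x^{a-1}(1-x)^b$. The functional form, the exponents, and the helper names are illustrative assumptions only; they are not the fitted distributions of Fig. 17.6.

```python
import math

def beta(a, b):
    """Euler beta function B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def toy_valence(n_quarks, a, b):
    """Toy valence distribution f(x) = N x^(a-1) (1-x)^b whose integral is n_quarks.

    The shape and exponents are hypothetical, chosen only for illustration.
    """
    norm = n_quarks / beta(a, b + 1.0)
    return lambda x: norm * x ** (a - 1.0) * (1.0 - x) ** b

def integrate(g, n_steps=20000):
    """Midpoint rule for int_0^1 g(x) dx, using x = t^2 to tame x^(-1/2) behavior."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += g(t * t) * 2.0 * t  # dx = 2 t dt
    return total * h

# Stand-ins for f_u - f_ubar and f_d - f_dbar (assumed exponents).
u_valence = toy_valence(2, a=0.5, b=3.0)
d_valence = toy_valence(1, a=0.5, b=4.0)

n_u = integrate(u_valence)  # number sum rule: should be close to 2
n_d = integrate(d_valence)  # number sum rule: should be close to 1
# Momentum fraction carried by the toy valence quarks, cf. Eq. (17.39).
momentum = integrate(lambda x: x * (u_valence(x) + d_valence(x)))

print(f"net u number: {n_u:.6f}")
print(f"net d number: {n_d:.6f}")
print(f"valence momentum fraction: {momentum:.4f}")
```

In this toy model the valence quarks carry well under the full momentum of the proton, so the remainder in (17.39) would have to come from the sea quarks and gluons. The specific fraction depends entirely on the assumed exponents.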

17.4 Hard-Scattering Processes in Hadron Collisions 

If one collides hadrons with other hadrons at very high energy, most of the 
collisions will involve only soft interactions of the constituent quarks and 
gluons. Such interactions cannot be treated using perturbative QCD, because 
$\alpha_s$ is large when the momentum transfer is small. In some collisions, 
however, two quarks or gluons will exchange a large momentum $p_\perp$ perpendicular 
to the collision axis. Then, as in deep inelastic scattering, the elementary 
interaction takes place very rapidly compared to the internal time scale of the 
hadron wave functions, so the lowest-order QCD prediction should accurately 
describe the process. Again, we should find a parton model formula that is 
built from a leading-order subprocess cross section, integrated with parton 
distribution functions. For the case of proton-proton scattering, these 
functions will be the same ones that are measured in lepton-proton deep inelastic 
scattering. 

For example, if the hard parton-level process involves quark-antiquark 
scattering into a final state $Y$, the leading-order QCD prediction takes the 
form 

$$\sigma\bigl(p(P_1) + p(P_2) \to Y + X\bigr) = \int_0^1 dx_1 \int_0^1 dx_2 \sum_f f_f(x_1)\, f_{\bar f}(x_2)\cdot \sigma\bigl(q_f(x_1 P) + \bar q_f(x_2 P) \to Y\bigr), \tag{17.40}$$


where the sum runs over all species of quarks and antiquarks: $u$, $d$, $\bar u$, $\bar d$, .... 
(Here again, $X$ denotes any hadronic final state.) The same formula, with 
appropriately modified distribution functions, applies to any other 
hadron-hadron collision. This formula will be a good first approximation if, by some 
invariant measure, a large momentum is transferred in the $q\bar q$ reaction. In this 
section we will discuss several examples of processes of this type. 


Lepton Pair Production 

The simplest example to analyze is the reaction in which a high-mass lepton 
pair $\ell^+\ell^-$ emerges from $q\bar q$ annihilation in a proton-proton collision. This 
reaction, called the Drell-Yan process, is illustrated in Fig. 17.7. In this case, 
the underlying $q\bar q$ reaction is described by an elementary QED cross section. 
To the leading order in QCD, the cross section that we require, $\sigma(q\bar q \to \ell^+\ell^-)$, 
is simply related to the cross section $\sigma(e^+e^- \to q\bar q)$ given in (17.4). The only 
difference between the two calculations is that we must average rather than 
sum over the color orientations of the quark and antiquark. This gives two 
extra factors of 1/3. Thus, 

$$\sigma(q_f \bar q_f \to \ell^+\ell^-) = \frac{1}{3}\, Q_f^2 \cdot \frac{4\pi\alpha^2}{3\hat s}. \tag{17.41}$$

If both final-state lepton momenta are observed, it is possible to reconstruct 
the total 4-momentum $q$ of the virtual photon. It is also possible to 
determine the longitudinal fractions of the initial quark and antiquark, as we 
will now show. Let 

$$M^2 = q^2 \tag{17.42}$$

be the square of the invariant mass of the Drell-Yan pair. (Do not confuse 
this quantity $M$ with the renormalization scale.) If the initial partons have 
small transverse momentum, the transverse momentum of the virtual photon 
will also be small. Its longitudinal momentum, however, will in general be 
substantial. We parametrize this using the rapidity, $Y$, of the virtual photon, 
as defined in Eq. (3.48): 

$$q^0 = M \cosh Y, \tag{17.43}$$

where $q^0$ is measured in the $pp$ center of mass frame. We will express the 
longitudinal fractions of the quarks, and hence the Drell-Yan cross section, in 
terms of the observables $M^2$ and $Y$. 

Figure 17.7. The Drell-Yan process: $pp \to \ell^+\ell^-$ + anything. 

In the $pp$ center of mass frame, the proton momenta take the explicit 
forms 

$$P_1 = (E, 0, 0, E), \qquad P_2 = (E, 0, 0, -E),$$

where $E$ satisfies $s = 4E^2$. Ignoring their small transverse momenta, we can 
write the constituent quark and antiquark momenta as $x_1$ and $x_2$ times these 
vectors, so that 

$$q = x_1 P_1 + x_2 P_2 = \bigl((x_1 + x_2)E,\, 0,\, 0,\, (x_1 - x_2)E\bigr). \tag{17.44}$$

By computing the invariant square of this vector we find 

$$M^2 = x_1 x_2\, s. \tag{17.45}$$

Similarly, comparing (17.43) with (17.44), we find 

$$\cosh Y = \frac{x_1 + x_2}{2\sqrt{x_1 x_2}} = \frac{1}{2}\left(\sqrt{\frac{x_1}{x_2}} + \sqrt{\frac{x_2}{x_1}}\right),$$

which implies 

$$\exp Y = \sqrt{\frac{x_1}{x_2}}. \tag{17.46}$$

These equations can be inverted to determine $x_1$ and $x_2$: 

$$x_1 = \frac{M}{\sqrt{s}}\, e^{Y}, \qquad x_2 = \frac{M}{\sqrt{s}}\, e^{-Y}. \tag{17.47}$$


Relations (17.45) and (17.46) let us convert the integral in Eq. (17.40) into 
an integral over the parameters $M^2$, $Y$ of the produced leptons. The Jacobian 
of the change of variables is 

$$\frac{\partial(M^2, Y)}{\partial(x_1, x_2)} = \left|\det\begin{pmatrix} x_2 s & x_1 s \\ \dfrac{1}{2x_1} & -\dfrac{1}{2x_2} \end{pmatrix}\right| = s = \frac{M^2}{x_1 x_2}.$$
The cross section for lepton pair production is therefore 

$$\frac{d^2\sigma}{dM^2\, dY}(pp \to \ell^+\ell^- + X) = \sum_f x_1 f_f(x_1)\, x_2 f_{\bar f}(x_2) \cdot \frac{1}{3}\, Q_f^2 \cdot \frac{4\pi\alpha^2}{3M^4}, \tag{17.48}$$

where $x_1$ and $x_2$ are given by Eq. (17.47). It is remarkable that the cross 
section for the Drell-Yan process is determined point by point by information 
derivable from deep inelastic scattering. Unfortunately, the relation between 
the two processes implied by (17.48) receives a correction of order $\alpha_s(M)$ that 
turns out to be numerically large, and which must be included to check this 
prediction against experiment. 
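
The change of variables in Eqs. (17.45)-(17.47) can be sketched numerically as follows. The function names and the sample beam energy, pair mass, and rapidity are assumptions chosen only to illustrate the round trip between $(M, Y)$ and $(x_1, x_2)$.

```python
import math

def momentum_fractions(mass, rapidity, sqrt_s):
    """Longitudinal fractions (x1, x2) of the annihilating partons, Eq. (17.47)."""
    ratio = mass / sqrt_s
    return ratio * math.exp(rapidity), ratio * math.exp(-rapidity)

def pair_observables(x1, x2, sqrt_s):
    """Invert the map: M from Eq. (17.45) and Y from Eq. (17.46)."""
    mass = math.sqrt(x1 * x2) * sqrt_s
    rapidity = 0.5 * math.log(x1 / x2)
    return mass, rapidity

def max_rapidity(mass, sqrt_s):
    """Largest |Y| allowed by x1, x2 <= 1."""
    return math.log(sqrt_s / mass)

# Illustrative numbers (assumed): a 10 GeV lepton pair at Y = 0.5, sqrt(s) = 40 GeV.
SQRT_S = 40.0
x1, x2 = momentum_fractions(10.0, 0.5, SQRT_S)
mass, rapidity = pair_observables(x1, x2, SQRT_S)

print(f"x1 = {x1:.4f}, x2 = {x2:.4f}")
print(f"recovered M = {mass:.4f} GeV, Y = {rapidity:.4f}")
print(f"|Y| must not exceed {max_rapidity(10.0, SQRT_S):.4f}")
```

Because each observed pair fixes both $x_1$ and $x_2$, a measurement binned in $M^2$ and $Y$ probes the product of parton distributions at a definite pair of momentum fractions.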

General Kinematics of Pair Production 

In deriving Eq. (17.48), we used the total cross section (17.41) for the 
parton-level process, integrated over the angular distribution of the outgoing leptons. 
In principle, we could have retained the angular information and derived a 
triply differential distribution. This would be the most complete prediction 
possible for a two-body parton-level reaction. It will be useful to work out the 
kinematics of such reactions, taking a more general viewpoint. In the generic 
situation, a parton of type 1 from proton 1 scatters from a parton of type 2 
from proton 2, yielding partons of types 3 and 4, with a squared momentum 
transfer $\hat t$. This generic process is shown in Fig. 17.8. In the Drell-Yan process, 
partons 3 and 4 are leptons. But these partons could also be quarks or gluons, 
which materialize as hadronic jets. We assume that all partons can be treated 
as massless. In parton variables, the cross section for this process is 

$$\frac{d^3\sigma}{dx_1\, dx_2\, d\hat t}(pp \to 3 + 4 + X) = f_1(x_1)\, f_2(x_2)\, \frac{d\hat\sigma}{d\hat t}(1 + 2 \to 3 + 4). \tag{17.49}$$

Let us now translate this formula to observable parameters of the final state. 

In the leading order of QCD, the transverse momenta of partons 3 and 4 
must be equal and opposite, but their longitudinal momenta are not constrained. 
We will take the three parameters of the final state to be the common 
magnitude of the parton transverse momenta $p_\perp$ and the longitudinal 
rapidities $y_3$, $y_4$ of the final-state partons, defined by the formulae 

$$E_i = p_\perp \cosh y_i, \qquad p_{i\parallel} = p_\perp \sinh y_i. \tag{17.50}$$

The longitudinal rapidity $y_i$ gives the boost of the particle $i$ from the frame 
where it has zero longitudinal momentum.† Recall from Section 3.3 that 
rapidities simply add under collinear boosts. The transverse momentum is 
invariant under longitudinal boosts. Thus, $(y_3, y_4, p_\perp)$ is a set of variables with 
convenient Lorentz transformation properties with respect to boosts along the 
collision axis. We will now see that these three parameters are related in a 
straightforward way to the underlying variables $x_1$, $x_2$, $\hat t$. 

†In the literature on hadron collisions, $y_i$ is usually called simply the rapidity, 
with the restriction to longitudinal boosts being understood. 

Figure 17.8. A generic two-body parton scattering process. 

Consider the center of mass frame of the colliding partons. The total 
energy in this frame is $\sqrt{\hat s}$. Let us use a subscript $*$ to denote other quantities 
measured in this frame, for instance, $\theta_*$ for the parton scattering angle. Then 

$$p_{3\parallel *} = \tfrac{1}{2}\sqrt{\hat s}\,\cos\theta_*, \qquad p_{3\perp *} = \tfrac{1}{2}\sqrt{\hat s}\,\sin\theta_*, \tag{17.51}$$

and $p_{4*}$ is oriented just oppositely. This frame is also the center of mass frame 
of partons 3 and 4, so 

$$y_{3*} = -y_{4*} \equiv y_*. \tag{17.52}$$

Since rapidities transform by shifts, we can solve for $y_*$ and for the rapidity 
$Y$ by which one must boost to reach this frame: 

$$y_* = \tfrac{1}{2}(y_3 - y_4), \qquad Y = \tfrac{1}{2}(y_3 + y_4). \tag{17.53}$$

The scattering angle $\theta_*$ is determined from $y_*$ by combining (17.51) with the 
relation $E_* = p_\perp \cosh y_*$: 

$$\frac{1}{\sin\theta_*} = \cosh y_*. \tag{17.54}$$

Then the Mandelstam variables 

$$\hat s = \frac{4 p_\perp^2}{\sin^2\theta_*}, \qquad \hat t = -\tfrac{1}{2}\hat s\,(1 - \cos\theta_*) \tag{17.55}$$

can be expressed as 

$$\hat s = 4 p_\perp^2 \cosh^2 y_*, \qquad \hat t = -2 p_\perp^2 \cosh y_*\, e^{-y_*}. \tag{17.56}$$

We can combine the first of these expressions with (17.47) to determine $x_1$ 
and $x_2$: 

$$x_1 = \frac{2p_\perp}{\sqrt{s}} \cosh y_*\, e^{Y}, \qquad x_2 = \frac{2p_\perp}{\sqrt{s}} \cosh y_*\, e^{-Y}. \tag{17.57}$$

To translate the cross section (17.49) to the final parton observables, we 
need the Jacobian 

$$\frac{\partial(x_1, x_2, \hat t)}{\partial(y_3, y_4, p_\perp)} = \frac{8 p_\perp^3}{s} \cosh^2 y_* = \frac{2 p_\perp \hat s}{s}. \tag{17.58}$$
Multiplying Eq. (17.49) by this factor gives 

$$\frac{d^3\sigma}{dy_3\, dy_4\, dp_\perp} = f_1(x_1)\, f_2(x_2)\, \frac{2 p_\perp \hat s}{s}\, \frac{d\hat\sigma}{d\hat t}(1 + 2 \to 3 + 4). \tag{17.59}$$

This can be simplified a bit using the relations $\hat s = x_1 x_2 s$ and $p_\perp dp_\perp = 
d^2p_\perp / 2\pi$, yielding the final result: 

$$\frac{d^4\sigma}{dy_3\, dy_4\, d^2p_\perp} = x_1 f_1(x_1)\, x_2 f_2(x_2)\, \frac{1}{\pi}\, \frac{d\hat\sigma}{d\hat t}(1 + 2 \to 3 + 4). \tag{17.60}$$

In this formula, $x_1$, $x_2$, and the Mandelstam variables of the parton subprocess 
are given by Eqs. (17.57) and (17.56). 
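
The map from the observables $(y_3, y_4, p_\perp)$ to the parton variables can be written out as a short sketch. The function name, the dictionary keys, and the sample values are assumptions made for illustration; the formulas are Eqs. (17.53), (17.56), and (17.57), with $\hat u$ fixed by $\hat s + \hat t + \hat u = 0$ for massless partons.

```python
import math

def parton_kinematics(y3, y4, pt, sqrt_s):
    """Translate final-state observables into parton-level variables.

    y3, y4 : longitudinal rapidities of the two final-state partons
    pt     : common transverse momentum magnitude
    sqrt_s : hadron-hadron center of mass energy
    """
    y_star = 0.5 * (y3 - y4)  # rapidity in the parton CM frame, Eq. (17.53)
    boost = 0.5 * (y3 + y4)   # rapidity Y of the parton CM frame
    s_hat = 4.0 * pt ** 2 * math.cosh(y_star) ** 2                  # Eq. (17.56)
    t_hat = -2.0 * pt ** 2 * math.cosh(y_star) * math.exp(-y_star)  # Eq. (17.56)
    u_hat = -s_hat - t_hat    # massless partons: s + t + u = 0
    x1 = 2.0 * pt / sqrt_s * math.cosh(y_star) * math.exp(boost)    # Eq. (17.57)
    x2 = 2.0 * pt / sqrt_s * math.cosh(y_star) * math.exp(-boost)
    return {"x1": x1, "x2": x2, "s_hat": s_hat, "t_hat": t_hat, "u_hat": u_hat}

# Illustrative event (assumed numbers): two 25 GeV jets at sqrt(s) = 1800 GeV.
kin = parton_kinematics(y3=0.8, y4=-0.3, pt=25.0, sqrt_s=1800.0)
for name, value in kin.items():
    print(f"{name:6s} = {value:.6g}")
```

A quick consistency check is that `x1 * x2 * sqrt_s**2` reproduces `s_hat`, which is the relation $\hat s = x_1 x_2 s$ used to pass from (17.59) to (17.60).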

This result gives us the complete distribution of final-state leptons or jets 
in any two-body reaction of partons. For example, to find the distribution of 
final-state leptons in the Drell-Yan process, we would insert into this formula 
the differential cross section for quark annihilation into leptons, 

$$\frac{d\hat\sigma}{d\hat t}(q_f \bar q_f \to \ell^+\ell^-) = \frac{1}{3}\, Q_f^2 \cdot \frac{2\pi\alpha^2}{\hat s^2} \cdot \frac{\hat t^2 + \hat u^2}{\hat s^2}. \tag{17.61}$$


The formula can be applied equally well to other two-body parton reactions, 
if we know the relevant parton-level differential cross sections. 


Jet Pair Production 

The most common two-body parton reactions are those of QCD, involving 
quarks, gluons, or both. Unfortunately, it is very difficult to distinguish 
hadronic jets initiated by gluons from those initiated by quarks. It is even more 
difficult to determine experimentally whether the initial partons in a 
hard-scattering process were quarks or gluons. Thus, the predictions of QCD for 
hard-scattering processes are most often quoted as cross sections for jet 
production in hadronic collisions, summing over all possible reactions of quarks, 
antiquarks, and gluons. In any event, to derive these predictions, we must 
work out the basic parton-parton cross sections. 

The simple two-body scattering processes of quarks, antiquarks, and gluons 
are the elementary processes of QCD perturbation theory, in the same 
sense that the reactions studied in Chapter 5 are the elementary processes of 
QED perturbation theory. They are the basic hadronic hard-scattering reactions 
that appear in QCD at the leading order in $\alpha_s$. In the remainder of this 
section, we will write down the cross section formulae for the various possible 
quark and gluon subprocesses. All of these cross sections will be of order $\alpha_s^2$. 
In practice, this $\alpha_s$ should be evaluated at a typical momentum transfer of 
the reaction, for example, at $Q^2 = \hat t$. 

The simplest subprocess is the scattering of different species of quarks, 
for example, $u + d \to u + d$. At order $\alpha_s$, this process occurs via the 
Feynman diagram shown in Fig. 17.9. This process is analogous to electron-muon 
scattering in QED, for which we wrote the cross section in Eq. (17.22): 

$$\frac{d\sigma}{d\hat t}(e^-\mu^- \to e^-\mu^-) = \frac{2\pi\alpha^2}{\hat s^2}\left[\frac{\hat s^2 + \hat u^2}{\hat t^2}\right]. \tag{17.62}$$

Figure 17.9. Feynman diagram contributing to $ud \to ud$. 

To convert this to the cross section for quark scattering in QCD, we need only 
replace the QED coupling $e^2$ by $g^2$ times an $SU(3)$ group theory factor. The 
QCD diagram contains the factor 

$$(t^a)_{i'i}\,(t^a)_{j'j},$$

where $i$, $i'$ are the initial and final colors of the $u$ quark and $j$, $j'$ are the initial 
and final colors of the $d$ quark. To compute the cross section, we must square 
this factor, sum over final colors, and average over initial colors. This gives 
the factor 

$$\left(\frac{1}{3}\right)^2 \cdot \mathrm{tr}[t^b t^a] \cdot \mathrm{tr}[t^b t^a] = \frac{1}{9}\,[C(r)]^2\, \delta^{ab}\delta^{ab} = \frac{1}{9} \cdot \frac{1}{4} \cdot 8 = \frac{2}{9}, \tag{17.63}$$

where we have used Eq. (15.78) and $C(r) = 1/2$ for the fundamental 
representation of $SU(3)$. Thus for $ud$ scattering, 

$$\frac{d\sigma}{d\hat t}(ud \to ud) = \frac{4\pi\alpha_s^2}{9\hat s^2}\left[\frac{\hat s^2 + \hat u^2}{\hat t^2}\right]. \tag{17.64}$$


The same formula applies for the scattering of any two different quarks, 
or, by crossing, to the scattering of a quark and an antiquark of a different 
species. Crossing from the $t$ to the $s$ channel gives the cross section for $q\bar q$ 
annihilation into a different species: 

$$\frac{d\sigma}{d\hat t}(u\bar u \to d\bar d) = \frac{4\pi\alpha_s^2}{9\hat s^2}\left[\frac{\hat t^2 + \hat u^2}{\hat s^2}\right]. \tag{17.65}$$


The scattering of a quark with an antiquark of the same species is more 
complicated, since now there are two Feynman diagrams, shown in Fig. 17.10, 
which interfere with one another. The analogous QED process is Bhabha 
scattering, $e^+e^- \to e^+e^-$, for which we worked out the cross section in 
Problem 5.2: 

$$\frac{d\sigma}{d\hat t}(e^+e^- \to e^+e^-) = \frac{2\pi\alpha^2}{\hat s^2}\left[\frac{\hat s^2 + \hat u^2}{\hat t^2} + \frac{\hat t^2 + \hat u^2}{\hat s^2} + \frac{2\hat u^2}{\hat s \hat t}\right]. \tag{17.66}$$

However, it is not quite straightforward to transcribe this to QCD, because 
different terms receive different color factors. 

Figure 17.10. Feynman diagrams contributing to $u\bar u \to u\bar u$. 

This process is most easily analyzed using initial and final states of 
definite helicity. For massless fermions, helicity is conserved, so the reaction 
$e^+_R e^-_L \to e^+_L e^-_R$ can receive a contribution only from the $s$-channel diagram, 
while $e^+_R e^-_R \to e^+_R e^-_R$ can receive a contribution only from the $t$-channel 
diagram. The corresponding cross sections are 

$$\frac{d\sigma}{d\hat t}(e^+_R e^-_L \to e^+_L e^-_R) = \frac{4\pi\alpha^2}{\hat s^2}\, \frac{\hat t^2}{\hat s^2}, \qquad \frac{d\sigma}{d\hat t}(e^+_R e^-_R \to e^+_R e^-_R) = \frac{4\pi\alpha^2}{\hat s^2}\, \frac{\hat s^2}{\hat t^2}. \tag{17.67}$$

The cross section for $e^+_R e^-_R \to e^+_L e^-_L$ must vanish. The fourth possible 
process involving $e^+_R$ receives contributions from both $s$- and $t$-channel diagrams. 
Computing this contribution explicitly, one finds 

$$\frac{d\sigma}{d\hat t}(e^+_R e^-_L \to e^+_R e^-_L) = \frac{4\pi\alpha^2}{\hat s^2}\, \hat u^2 \left(\frac{1}{\hat s} + \frac{1}{\hat t}\right)^2; \tag{17.68}$$

the cross term in the square is the interference term between the two diagrams. 
The invariance of QED under parity implies that the values of all of these cross 
sections remain identical when all helicities are reversed. It is easy to check 
that the spin-averaged cross section is indeed given by (17.66). 
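
The check mentioned in the last sentence can be sketched numerically. The code below assumes the helicity cross sections (17.67)-(17.68) with a common prefactor $4\pi\alpha^2/\hat s^2$ and compares their spin average with the bracket of the Bhabha formula (17.66); the function names and the sample kinematic point are illustrative choices.

```python
def helicity_terms(s, t):
    """Helicity cross sections (17.67)-(17.68), in units of 4*pi*alpha^2/s^2."""
    u = -s - t  # massless kinematics: s + t + u = 0
    return {
        "e+R e-L -> e+L e-R": t ** 2 / s ** 2,                    # s-channel only
        "e+R e-R -> e+R e-R": s ** 2 / t ** 2,                    # t-channel only
        "e+R e-R -> e+L e-L": 0.0,                                # vanishes
        "e+R e-L -> e+R e-L": u ** 2 * (1.0 / s + 1.0 / t) ** 2,  # both channels
    }

def spin_average(s, t):
    """Spin-averaged cross section in units of 2*pi*alpha^2/s^2."""
    total = sum(helicity_terms(s, t).values())
    parity_doubled = 2.0 * total      # flipping all helicities gives equal values
    averaged = parity_doubled / 4.0   # average over the 4 initial helicity states
    return 2.0 * averaged             # convert from 4*pi to 2*pi units

def bhabha_bracket(s, t):
    """Bracket of Eq. (17.66), i.e. the cross section in units of 2*pi*alpha^2/s^2."""
    u = -s - t
    return (s ** 2 + u ** 2) / t ** 2 + (t ** 2 + u ** 2) / s ** 2 + 2.0 * u ** 2 / (s * t)

s_hat, t_hat = 1.0, -0.3  # arbitrary physical point with -s < t < 0
print(spin_average(s_hat, t_hat), bhabha_bracket(s_hat, t_hat))
```

The two printed numbers agree for any allowed $\hat t$, since expanding the square in (17.68) produces exactly the $\hat u^2/\hat s^2$, $\hat u^2/\hat t^2$, and $2\hat u^2/\hat s\hat t$ terms of (17.66).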

To convert Eq. (17.66) to a QCD cross section averaged over colors, we 
can assign the color factor (17.63) to the square of any individual diagram. 
However, the cross term between the two diagrams in Fig. 17.10 receives a 
different color factor: 

$$\left(\frac{1}{3}\right)^2 \cdot (t^a)_{i'i}\,(t^a)_{jj'} \cdot (t^b)_{ij}\,(t^b)_{j'i'} = \frac{1}{9}\, \mathrm{tr}[t^a t^b t^a t^b]. \tag{17.69}$$

To evaluate this factor, we can make use of Eq. (16.79): 

$$t^a t^b t^a = \left[C_2(r) - \tfrac{1}{2} C_2(G)\right] t^b = \left(\tfrac{4}{3} - \tfrac{3}{2}\right) t^b = -\tfrac{1}{6}\, t^b.$$

So the color factor (17.69) equals $-2/27$. 
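
The color factors (17.63) and (17.69) can also be checked by brute force. The sketch below assumes the standard Gell-Mann basis with $t^a = \lambda^a/2$, written out by hand in plain Python; the helper names are illustrative.

```python
import math

def su3_generators():
    """Return the eight SU(3) generators t^a = lambda^a / 2 as 3x3 nested lists."""
    r3 = 1.0 / math.sqrt(3.0)
    gell_mann = [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[r3, 0, 0], [0, r3, 0], [0, 0, -2 * r3]],
    ]
    return [[[entry / 2 for entry in row] for row in lam] for lam in gell_mann]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

T = su3_generators()

# Eq. (17.63): (1/3)^2 * sum_{a,b} tr[t^b t^a] * tr[t^b t^a]
square_factor = sum(trace(matmul(tb, ta)) ** 2 for ta in T for tb in T) / 9

# Eq. (17.69): (1/3)^2 * sum_{a,b} tr[t^a t^b t^a t^b]
interference_factor = sum(
    trace(matmul(matmul(ta, tb), matmul(ta, tb))) for ta in T for tb in T
) / 9

print(f"squared-diagram factor: {complex(square_factor).real:.6f}  (2/9 = {2/9:.6f})")
print(f"interference factor:    {complex(interference_factor).real:.6f}  (-2/27 = {-2/27:.6f})")
```

The relative factor of $-1/3$ between the two results is what changes the sign and size of the interference term when the Bhabha formula is carried over to QCD.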
Figure 17.11. Feynman diagrams contributing to $q\bar q \to gg$. 


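Assembling the color factors and the helicity cross sections, we find the 
following result for the $u\bar u$ scattering cross section: 

$$\frac{d\sigma}{d\hat t}(u\bar u \to u\bar u) = \frac{4\pi\alpha_s^2}{9\hat s^2}\left[\frac{\hat s^2 + \hat u^2}{\hat t^2} + \frac{\hat t^2 + \hat u^2}{\hat s^2} - \frac{2}{3}\, \frac{\hat u^2}{\hat s \hat t}\right]. \tag{17.70}$$

By crossing between the $s$ and $u$ channels, we find the corresponding cross 
section for $uu \to uu$: 

$$\frac{d\sigma}{d\hat t}(uu \to uu) = \frac{4\pi\alpha_s^2}{9\hat s^2}\left[\frac{\hat u^2 + \hat s^2}{\hat t^2} + \frac{\hat t^2 + \hat s^2}{\hat u^2} - \frac{2}{3}\, \frac{\hat s^2}{\hat u \hat t}\right]. \tag{17.71}$$

The process $\bar u\bar u \to \bar u\bar u$ has the same cross section. This completes our catalogue 
of cross sections for the scattering of quarks and antiquarks. 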

We turn next to processes that involve both quarks and gluons. We will 
begin with the reaction $q\bar q \to gg$. This is the analogue of the QED annihilation 
of $e^+e^-$ to $\gamma\gamma$, discussed in Section 5.5. The QED cross section is 

$$\frac{d\sigma}{d\hat t}(e^+e^- \to \gamma\gamma) = \frac{2\pi\alpha^2}{\hat s^2}\left[\frac{\hat u}{\hat t} + \frac{\hat t}{\hat u}\right]. \tag{17.72}$$

Since the photons are identical particles, this expression should be integrated 
over only half of the $4\pi$ solid angle. 

The QCD reaction is considerably more complicated. As we saw in 
Section 16.1, this process receives contributions from three Feynman diagrams, 
shown in Fig. 17.11. These contributions must be summed over the transverse 
polarization states of the gluons. If one chooses instead to evaluate the sum 
over gluon polarizations by the replacement 

$$\sum_\epsilon \epsilon^\mu \epsilon^{\nu *} \to -g^{\mu\nu}, \tag{17.73}$$

we saw in Section 16.3 that one must also include the (negative) cross section 
for $q\bar q$ annihilation to a ghost-antighost pair. 

The leading behavior of the $q\bar q \to gg$ cross section as $\hat t$ or $\hat u \to 0$ is not so 
hard to evaluate. In either case, only the single diagram with the corresponding 
kinematic singularity contributes. The color factor associated with the square 
of either of these diagrams is the square of 

$$(t^a)_{ij}\,(t^b)_{jk},$$

summed over the gluon colors $a$, $b$ and averaged over the $q$ and $\bar q$ colors $i$, $k$. 
This gives 

$$\left(\frac{1}{3}\right)^2 \cdot \mathrm{tr}[t^a t^b t^b t^a] = \frac{1}{9} \cdot 3\,\bigl(C_2(r)\bigr)^2 = \frac{16}{27}. \tag{17.74}$$

Thus the most singular terms are given by the QED result, with $\alpha$ replaced 
by $\alpha_s$, multiplied by 16/27. The complete evaluation of the cross section is 
left for Problem 17.3; the result is 

$$\frac{d\sigma}{d\hat t}(q\bar q \to gg) = \frac{32\pi\alpha_s^2}{27\hat s^2}\left[\frac{\hat u}{\hat t} + \frac{\hat t}{\hat u} - \frac{9}{4}\left(\frac{\hat t^2 + \hat u^2}{\hat s^2}\right)\right]. \tag{17.75}$$

Figure 17.12. Feynman diagrams contributing to $gg \to gg$. 


The cross sections for the remaining quark-gluon processes can be 
obtained from this result by crossing. The result for the inverse reaction $gg \to q\bar q$ 
involves the same squared matrix element as (17.75); the only difference is 
that we average over gluon rather than quark colors, giving a relative factor 
of $(3/8)^2$. Thus, 

$$\frac{d\sigma}{d\hat t}(gg \to q\bar q) = \frac{\pi\alpha_s^2}{6\hat s^2}\left[\frac{\hat u}{\hat t} + \frac{\hat t}{\hat u} - \frac{9}{4}\left(\frac{\hat t^2 + \hat u^2}{\hat s^2}\right)\right]. \tag{17.76}$$

For the reaction $qg \to qg$, cross the $s$ and $t$ channels in Eq. (17.75) and 
multiply by 3/8 since there is one gluon color average. This gives 

$$\frac{d\sigma}{d\hat t}(qg \to qg) = \frac{4\pi\alpha_s^2}{9\hat s^2}\left[-\frac{\hat u}{\hat s} - \frac{\hat s}{\hat u} + \frac{9}{4}\left(\frac{\hat s^2 + \hat u^2}{\hat t^2}\right)\right]. \tag{17.77}$$

The cross section for $\bar q g \to \bar q g$ is identical. 

The final elementary process of QCD is gluon-gluon scattering. This has 
no QED analogue, and is rather tedious to evaluate. There are four 
leading-order diagrams, shown in Fig. 17.12. We discuss this process also in 
Problem 17.3. The final result for the spin- and color-averaged cross section is 

$$\frac{d\sigma}{d\hat t}(gg \to gg) = \frac{9\pi\alpha_s^2}{2\hat s^2}\left[3 - \frac{\hat t\hat u}{\hat s^2} - \frac{\hat s\hat u}{\hat t^2} - \frac{\hat s\hat t}{\hat u^2}\right]. \tag{17.78}$$
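
To get a feeling for the relative sizes of these subprocesses, the sketch below evaluates Eqs. (17.64), (17.65), (17.70), (17.71), and (17.75)-(17.78) in units of $\pi\alpha_s^2/\hat s^2$ at a fixed scattering angle. The function name, the dictionary labels, and the choice of $90^\circ$ scattering ($\hat t = \hat u = -\hat s/2$) are illustrative assumptions.

```python
def parton_cross_sections(s, t):
    """d(sigma)/dt for each subprocess, in units of pi * alpha_s^2 / s^2."""
    u = -s - t  # massless partons: s + t + u = 0
    quark_gluon = u / t + t / u - (9.0 / 4.0) * (t ** 2 + u ** 2) / s ** 2
    return {
        "u d -> u d": (4.0 / 9.0) * (s ** 2 + u ** 2) / t ** 2,            # (17.64)
        "u ubar -> d dbar": (4.0 / 9.0) * (t ** 2 + u ** 2) / s ** 2,      # (17.65)
        "u ubar -> u ubar": (4.0 / 9.0) * (
            (s ** 2 + u ** 2) / t ** 2 + (t ** 2 + u ** 2) / s ** 2
            - (2.0 / 3.0) * u ** 2 / (s * t)),                             # (17.70)
        "u u -> u u": (4.0 / 9.0) * (
            (u ** 2 + s ** 2) / t ** 2 + (t ** 2 + s ** 2) / u ** 2
            - (2.0 / 3.0) * s ** 2 / (u * t)),                             # (17.71)
        "q qbar -> g g": (32.0 / 27.0) * quark_gluon,                      # (17.75)
        "g g -> q qbar": (1.0 / 6.0) * quark_gluon,                        # (17.76)
        "q g -> q g": (4.0 / 9.0) * (
            -u / s - s / u + (9.0 / 4.0) * (s ** 2 + u ** 2) / t ** 2),    # (17.77)
        "g g -> g g": (9.0 / 2.0) * (
            3.0 - t * u / s ** 2 - s * u / t ** 2 - s * t / u ** 2),       # (17.78)
    }

# 90-degree scattering in the parton center of mass frame: t = u = -s/2.
at_90_degrees = parton_cross_sections(1.0, -0.5)
for process, value in sorted(at_90_degrees.items(), key=lambda kv: -kv[1]):
    print(f"{process:18s} {value:8.3f}")
```

At this angle the gluon-gluon entry is the largest by a wide margin, which reflects the larger color charge of the gluon. The actual mix of subprocesses in a jet sample also depends on the parton distribution functions that weight each initial state.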


The various parton cross sections listed in this section can be combined 
with the parton distribution functions to predict the cross section for jet 
production in hadron-hadron collisions. As an example, we show in Fig. 17.13 a 
comparison of the invariant mass ($\sqrt{\hat s}$) distribution predicted for parton-parton 
scattering with the invariant mass distribution of two-jet events observed in 
high-energy $p\bar p$ collisions. The overall normalization of the theoretical 
prediction is uncertain by about a factor of 2 due to the ambiguity of the choice 
of $Q^2$ used to evaluate $\alpha_s(Q^2)$ in the parton cross sections, and due to 
similar ambiguities in deriving parton distributions from deep inelastic scattering 
cross sections. This uncertainty is reduced to about 30% when corrections of 
order $\alpha_s$ are included. Still, it is remarkable that the lowest-order QCD 
prediction tracks the observed distribution as a function of the two-jet invariant 
mass as it falls by six orders of magnitude. Thus, for the jet production cross 
section, as for hard processes involving leptons, QCD indeed gives a reasonable 
description of the behavior of the strong interactions at large momentum 
transfer. 

Figure 17.13. Two-jet invariant mass distribution in $p\bar p$ collisions at $E_{\rm cm} = 
1.8$ TeV, as measured by the CDF collaboration, F. Abe, et al., Phys. Rev. 
D48, 998 (1993). The measurement is compared to a leading-order QCD 
calculation using the CTEQ structure functions described in Fig. 17.6. The 
three lower curves show the invariant mass distributions for the three 
components of the theoretical prediction: quark-quark (and antiquark) scattering, 
quark-gluon scattering, and gluon-gluon scattering. 
17.5 Parton Evolution 

Now that we have examined the predictions of QCD at the leading order for 
several strong interaction processes, we should investigate the corrections to 
these predictions at the next order in $\alpha_s$. We saw in Section 17.2 that the 
corrections from individual diagrams may contain mass singularities, 
singularities associated with collinear emission processes which appear in the limit 
of zero mass. For the process of $e^+e^-$ annihilation to hadrons, we saw that 
these mass singularities, and the infrared divergences from soft gluon 
emission, cancel in the expression for the total cross section. It can be shown that 
this is a general feature of processes in which quarks and gluons are produced 
in the collision of leptons or photons. However, when quarks or gluons 
appear in the initial state of a parton subprocess, the corrections to the process 
will, in general, have mass singularities that do not cancel. In this section we 
will demonstrate this effect and work out its physical interpretation. We will 
find that these singular terms predict a violation of Bjorken scaling by terms 
depending on the logarithm of the momentum scale. In fact, they lead to a 
precise set of differential equations that govern the momentum dependence of 
the parton distributions. 

The basic phenomena associated with mass singularities in QCD are 
already present in the physics of collinear photon emission in QED at high 
energies, and so it is most straightforward to begin by studying that case. In 
this section, we will show that collinear photon emission leads to an analogue 
of a parton distribution function for the electron. We will derive a 
differential equation describing this distribution function, first constructed by Gribov 
and Lipatov. Finally, we will generalize this equation to QCD, following the 
construction of Altarelli and Parisi.† 

†V. N. Gribov and L. Lipatov, Sov. J. Nucl. Phys. 15, 438 (1972); G. Altarelli 
and G. Parisi, Nucl. Phys. B126, 298 (1977). We also strongly recommend reading 
the papers of J. Kogut and L. Susskind, Phys. Rev. D9, 697, 3391 (1974). 

In Chapters 5 and 6, we studied several examples of QED processes 
that involved diagrams with $t$- or $u$-channel singularities. In these cases, we 
found that the total cross section was generally enhanced by an extra factor 
$\log(s/m^2)$ in the high-energy limit. For example, in Eq. (5.95) we saw that 
the $u$-channel exchange diagram in Compton scattering, Fig. 17.14(a), leads 
to an integral that, in the high-energy limit, takes the form 

$$\int \frac{d\cos\theta}{(1 + \cos\theta)}.$$

The singularity as $\cos\theta \to -1$ is cut off by the electron mass, leading to the 
logarithmic enhancement factor. Thus the collinear photon emission costs a 
factor that is not $\alpha$ but rather $\alpha\log(s/m^2)$. Emission of multiple collinear 
photons, as in Fig. 17.14(b), gives contributions of order $(\alpha\log(s/m^2))^n$. To 
improve the accuracy of perturbation theory, it would be useful to find a 
procedure for summing these terms to all orders in $\alpha$. In QCD, the corresponding 
factor for collinear gluon emission would be 

$$\alpha_s(Q^2)\, \log\frac{Q^2}{\mu^2},$$

where $\mu$ is the momentum scale where nonperturbative QCD effects become 
important. Comparing with Eq. (17.17), we see that this product is of order 1. 
Thus, in this case, the resummation of large logarithms is essential if we are 
to make any quantitative predictions. 

Figure 17.14. Diagrams with mass singularities associated with collinear 
photon emission: (a) leading order; (b) higher order. 

Figure 17.15. General form of diagrams with mass singularities in QED. 
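
The claim that $\alpha_s(Q^2)\log(Q^2/\mu^2)$ is of order 1 can be made concrete with a short numerical sketch. It assumes the one-loop form $\alpha_s(Q) = 2\pi / (b_0 \log(Q/\Lambda))$ with $b_0 = 11 - \tfrac{2}{3} n_f$ for the running coupling referred to as Eq. (17.17); the values of $\Lambda$, $n_f$, and $\mu$ below are illustrative assumptions, not values taken from the text.

```python
import math

def alpha_s_one_loop(q, lam=0.2, n_f=5):
    """One-loop running coupling, alpha_s(Q) = 2*pi / (b0 * log(Q / Lambda)).

    q and lam are in GeV; lam = 0.2 GeV and n_f = 5 are assumed sample values.
    """
    b0 = 11.0 - 2.0 * n_f / 3.0
    return 2.0 * math.pi / (b0 * math.log(q / lam))

def collinear_factor(q, mu, lam=0.2, n_f=5):
    """alpha_s(Q^2) * log(Q^2 / mu^2), the cost of one collinear gluon emission."""
    return alpha_s_one_loop(q, lam, n_f) * math.log(q ** 2 / mu ** 2)

for q in (5.0, 10.0, 100.0):
    print(f"Q = {q:6.1f} GeV: alpha_s = {alpha_s_one_loop(q):.3f}, "
          f"alpha_s * log(Q^2/mu^2) = {collinear_factor(q, mu=1.0):.2f}")
```

If one sets $\mu = \Lambda$ in this one-loop form, the logarithms cancel and the product equals $4\pi/b_0$ for every $Q$, so raising the energy does not make the collinear factor small.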

In QED, diagrams with mass singularities associated with one collinear 
emission are of one of the forms shown in Fig. 17.15. In each case, the circle 
represents a scattering process with large momentum transfer. The mass 
singularity appears when the denominator of the intermediate propagator vanishes, 
that is, when the intermediate state is almost on-shell. Thus, it is natural to 
consider the first diagram in Fig. 17.15 to be a transition to a real photon 
and an almost-real electron, followed by the interaction of the electron with 
the remaining particles in the amplitude. The second diagram should have a 
similar interpretation with an almost-real photon in the intermediate state. 

The only subtlety comes in defining the polarization of the 
intermediate-state particle. For the case of an intermediate-state electron, the numerator 
of the propagator is 

$$\not{k} = \sum_s u^s(k)\, \bar u^s(k). \tag{17.79}$$

Thus, when $k^2 \to 0$, the photon emission vertex and the remaining part of 
the amplitude are contracted with on-shell polarization spinors for a massless 
electron. The analogous statement for the diagram with the photon in the 
intermediate state would be that the electron emission vertex and the remaining 
photon amplitude should be contracted with physical transverse polarization 
vectors for the intermediate-state photon. Since the numerator of the photon 
propagator is $g^{\mu\nu}$, it is not obvious that the photon propagator reduces in this 
way. But it is true. To see this, use the expansion for $g^{\mu\nu}$ in terms of massless 
polarization vectors given in Eq. (16.20): 

$$g^{\mu\nu} = \epsilon^{-\mu}\epsilon^{+\nu *} + \epsilon^{+\mu}\epsilon^{-\nu *} - \sum_i \epsilon_{Ti}^{\mu}\, \epsilon_{Ti}^{\nu *}. \tag{17.80}$$

Here $\epsilon_{Ti}^{\mu}$ are transverse polarization vectors. The forward polarization vector 
$\epsilon^{+}$ is proportional to the photon momentum $q^\mu$. When we contract $\epsilon^{+}$ with 
the QED scattering amplitude on the right, we will obtain zero by the Ward 
identity, and the contraction of $\epsilon^{+}$ with the electron emission vertex similarly 
gives zero. Thus, for the purpose of computing the singular term as the photon 
momentum $q$ goes on-shell, we may replace 

$$\frac{-i g^{\mu\nu}}{q^2} \to \frac{+i}{q^2} \sum_i \epsilon_{Ti}^{\mu}\, \epsilon_{Ti}^{\nu *} \tag{17.81}$$

and evaluate the photon emission and absorption amplitudes with transverse 
polarization vectors. 


Matrix Element for Electron Splitting 

By replacing the numerator of the intermediate propagator with a sum over 
polarization vectors, we decouple the photon or electron emission vertex from 
the rest of the diagram. We will now evaluate this vertex explicitly between 
physical polarization states of massless particles. The kinematics is shown in 
Fig. 17.16. The two final particles should be almost collinear, with a small 
relative transverse momentum. We can choose the incident electron momentum 
to lie along the 3 axis and the outgoing momenta to lie in the 1-3 plane. Let 
$z$ be the fraction of the energy of the initial electron that is carried off by the 
photon. Then the three 4-momenta can be written as 

$$p = (p, 0, 0, p), \qquad q \approx (zp,\, p_\perp,\, 0,\, zp), \qquad k \approx \bigl((1-z)p,\, -p_\perp,\, 0,\, (1-z)p\bigr). \tag{17.82}$$

These three vectors satisfy $p^2 = q^2 = k^2 = 0$, up to terms of order $p_\perp^2$. 

Figure 17.16. Kinematics of the vertex for emission of a collinear electron 
or photon. 

In the process where a real photon is emitted, we should have $p^2$ and $q^2$ 
exactly zero, and $k^2$ slightly off-shell by an amount of order $p_\perp^2$. We will need 
to know the value of $k^2$, which appears in the virtual electron propagator. So 
let us modify Eqs. (17.82) to satisfy the condition $q^2 = 0$ up to terms of order 
$p_\perp^4$, rewriting $q$ and $k$ as 

$$q = \Bigl(zp,\, p_\perp,\, 0,\, zp - \frac{p_\perp^2}{2zp}\Bigr), \qquad k = \Bigl((1-z)p,\, -p_\perp,\, 0,\, (1-z)p + \frac{p_\perp^2}{2zp}\Bigr). \tag{17.83}$$

With this modification, 

$$k^2 = -p_\perp^2 - 2(1-z)\,\frac{p_\perp^2}{2z} + \mathcal{O}(p_\perp^4).$$

Thus, if the photon is real and the electron is virtual, we have 

$$q^2 = 0, \qquad k^2 = -\frac{p_\perp^2}{z}. \tag{17.84}$$

Reciprocally, in the process with a real electron and a virtual photon, 

$$k^2 = 0, \qquad q^2 = -\frac{p_\perp^2}{(1-z)}. \tag{17.85}$$

These more accurate expressions will be needed only in the propagator of the 
virtual particle. The matrix element of the electron-photon vertex begins in 
order $p_\perp$, so it is not significantly affected by the modification of (17.82) to 
(17.83), and is the same (to lowest order) no matter which particle is virtual. 
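
The off-shellness in Eq. (17.84) can be checked directly by squaring the 4-vectors of Eq. (17.83). The sketch below uses the metric signature $(+,-,-,-)$; the function names and sample numbers are illustrative assumptions.

```python
def minkowski_square(v):
    """Invariant square of a 4-vector (E, px, py, pz) with signature (+, -, -, -)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def splitting_momenta(p, z, pt):
    """Photon momentum q and electron momentum k from Eq. (17.83)."""
    shift = pt ** 2 / (2.0 * z * p)  # longitudinal correction that keeps q^2 ~ 0
    q = (z * p, pt, 0.0, z * p - shift)
    k = ((1.0 - z) * p, -pt, 0.0, (1.0 - z) * p + shift)
    return q, k

# Sample values (assumed): a 100 GeV electron radiating 30% of its energy
# with a relative transverse momentum of 0.1 GeV.
p, z, pt = 100.0, 0.3, 0.1
q, k = splitting_momenta(p, z, pt)

print(f"q^2 = {minkowski_square(q):.3e}   (zero up to order pt^4)")
print(f"k^2 = {minkowski_square(k):.6f}")
print(f"-pt^2 / z = {-pt ** 2 / z:.6f}")
```

The sum `q + k` reproduces the incident momentum $(p, 0, 0, p)$ exactly, so the small virtuality $k^2$ is the price of keeping both the initial electron and the emitted photon on-shell.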

We now calculate the matrix elements of the QED vertex between massless 
states of definite helicity. If the initial electron is left-handed, the final electron 
must also be left-handed, by helicity conservation. Then the photon emission 
vertex is given by 

iM = U£{k){-ie'y l _ t )u L {p)eT t {q)i (17.86) 

where the photon polarization vector may be either left- or right-handed. 
Recalling the helicitv-basis expressions 

-° ^ , u L (p) = \[2 pO ^ ^ ^ (for m = 0), 

we can write more explicitly 



iM = —ie\j2(l—z)p\/2pi}{k)a l ^{p) ej?(g). 


( 17 . 87 ) 
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To order p ±, the left-handed spinors are 

€<P) = (”) , m = ( P±/2i 1 1 - i>P ) ■ (1T.88) 

The polarization vectors for the photon are 

*»-£**-$* (irw 

Notice that, when these vectors are contracted with the Pauli matrix in
Eq. (17.87), the first two components of the right-handed polarization
vector give $(\sigma^1 - i\sigma^2) = 2\sigma^-$, which annihilates $\xi(p)$.
The only remaining term comes from the $i = 3$ component, and we find

$$
i\mathcal{M}(e_L^- \to e_L^- \gamma_R) = ie\,\frac{\sqrt{2(1-z)}}{z}\,p_\perp. \tag{17.90}
$$

For the left-handed photon polarization, there is an additional contribution
from the first two components of $\epsilon_L^*$. These add to

$$
i\mathcal{M}(e_L^- \to e_L^- \gamma_L) = ie\,\frac{\sqrt{2(1-z)}}{z(1-z)}\,p_\perp. \tag{17.91}
$$


Parity invariance implies that the values of the matrix elements are unchanged 
if all helicities are flipped; this immediately gives the required matrix elements 
for the case of an initial $e_R^-$. The squared matrix element, averaged over
initial helicities, is therefore

$$
\frac12 \sum_{\text{pols.}} |\mathcal{M}|^2 =
\frac{2e^2 p_\perp^2}{z(1-z)} \left[ \frac{1 + (1-z)^2}{z} \right]. \tag{17.92}
$$


The first term in the brackets comes from a photon with spin parallel to the 
electron spin; the second term comes from a photon with spin opposite to the 
electron spin. 
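
The helicity amplitudes can be checked numerically by contracting the spinors of Eq. (17.88) with the polarization vectors of Eq. (17.89), as in Eq. (17.87). The sketch below does this with plain Python complex numbers. The helper names and sample kinematics are illustrative assumptions, the overall phase of the amplitude is dropped, and only magnitudes are compared with Eqs. (17.90)-(17.92).

```python
import math


def sigma_dot(eps):
    """Return the 2x2 matrix sigma^i eps^i for a 3-vector eps."""
    ex, ey, ez = eps
    return [[ez, ex - 1j * ey],
            [ex + 1j * ey, -ez]]


def sandwich(xi_left, mat, xi_right):
    """Return xi_left^dagger . mat . xi_right for two-component spinors."""
    total = 0j
    for a in range(2):
        for b in range(2):
            total += complex(xi_left[a]).conjugate() * mat[a][b] * xi_right[b]
    return total


def emission_amplitudes(p, z, pt, e=1.0):
    """Amplitudes (up to a common phase) for e_L -> e_L gamma_R and e_L -> e_L gamma_L."""
    xi_p = (0.0, 1.0)                                   # Eq. (17.88)
    xi_k = (pt / (2.0 * (1.0 - z) * p), 1.0)
    r = 1.0 / math.sqrt(2.0)
    eps_r_conj = (r, -1j * r, -r * pt / (z * p))        # Eq. (17.89)
    eps_l_conj = (r, 1j * r, -r * pt / (z * p))
    prefactor = e * math.sqrt(2.0 * (1.0 - z) * p) * math.sqrt(2.0 * p)
    m_r = prefactor * sandwich(xi_k, sigma_dot(eps_r_conj), xi_p)
    m_l = prefactor * sandwich(xi_k, sigma_dot(eps_l_conj), xi_p)
    return m_r, m_l


def spin_averaged_vertex(z, pt, e=1.0):
    """Right-hand side of Eq. (17.92)."""
    return 2.0 * e ** 2 * pt ** 2 / (z * (1.0 - z)) * (1.0 + (1.0 - z) ** 2) / z


m_r, m_l = emission_amplitudes(p=50.0, z=0.4, pt=0.1)
print(abs(m_r) ** 2 + abs(m_l) ** 2, spin_averaged_vertex(0.4, 0.1))
```

Because parity gives the same two magnitudes for an initial right-handed electron, the helicity average equals the sum of the two squared amplitudes computed here.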


The Equivalent Photon Approximation 

Now we have all the pieces needed to compute the cross sections for the 
processes shown in Fig. 17.15. We first consider the process with a virtual 
photon. Call the initial state on the right-hand side of the diagram X and the 
final state Y, and let $\mathcal{M}_{\gamma X}$ represent the matrix element for
the scattering of the photon from X. We will assume for simplicity that X is
unpolarized, so that the scattering cross section does not depend on the
virtual photon polarization. Then the complete diagram gives a cross section

$$
\sigma = \frac{1}{(1+v_X)\,2p\,2E_X} \int \frac{d^3k}{(2\pi)^3}\,\frac{1}{2k^0}
\int d\Pi_Y \left[ \frac12 \sum |\mathcal{M}|^2 \right]
\left( \frac{1}{q^2} \right)^2 |\mathcal{M}_{\gamma X}|^2, \tag{17.93}
$$

where $v_X$ is the velocity of X and $\int d\Pi_Y$ is the phase space integral
over Y.

The integral has a singularity when $k$ is collinear with the incident electron
momentum $p$. To isolate the singularity, substitute for $k^0$ and $q^2$ from
Eqs. (17.82) and (17.85) and rewrite the integral over $k$ as

$$
d^3k = dk^3\, d^2k_\perp = p\,dz \cdot \pi\, dp_\perp^2. \tag{17.94}
$$


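Then the cross section can be expressed as

$$
\begin{aligned}
\sigma &= \int \frac{p\,dz\,dp_\perp^2}{16\pi^2 (1-z)p}
\left[ \frac12 \sum |\mathcal{M}|^2 \right] \frac{(1-z)^2}{p_\perp^4}
\cdot \frac{z}{(1+v_X)\,2zp\,2E_X} \int d\Pi_Y\, |\mathcal{M}_{\gamma X}|^2 \\
&= \int \frac{dz\,dp_\perp^2}{16\pi^2 (1-z)}
\left[ \frac12 \sum |\mathcal{M}|^2 \right] \frac{z(1-z)^2}{p_\perp^4}
\cdot \sigma(\gamma X \to Y).
\end{aligned}
\tag{17.95}
$$

Finally, insert the spin-averaged electron emission vertex (17.92), to obtain

$$
\begin{aligned}
\sigma &= \int \frac{dz\,dp_\perp^2}{16\pi^2}\, \frac{z(1-z)}{p_\perp^4}\,
\frac{2e^2 p_\perp^2}{z(1-z)} \left[ \frac{1 + (1-z)^2}{z} \right]
\cdot \sigma(\gamma X \to Y) \\
&= \int_0^1 dz \int \frac{dp_\perp^2}{p_\perp^2}\, \frac{\alpha}{2\pi}
\left[ \frac{1 + (1-z)^2}{z} \right] \cdot \sigma(\gamma X \to Y).
\end{aligned}
\tag{17.96}
$$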


The integral over $p_\perp^2$ runs from momentum transfers of order $s$ down to
the electron mass $m^2$, which cuts off the singularity. Thus, our final result is

$$
\sigma(e^- X \to e^- Y) = \int_0^1 dz\, \frac{\alpha}{2\pi} \log\frac{s}{m^2}
\left[ \frac{1 + (1-z)^2}{z} \right] \cdot \sigma(\gamma X \to Y). \tag{17.97}
$$


The cross section on the right-hand side is computed for a real, transversely 
polarized photon of momentum $zp$. The factor $\log(s/m^2)$ represents the mass
singularity. This formula is the Weizsäcker-Williams equivalent photon
approximation, which we encountered earlier in Problems 5.5 and 6.2.

Formula (17.97) takes on a new significance when it is juxtaposed with 
the QCD predictions of the previous two sections. This QED formula has just 
the same form as a parton model expression, with the Weizsäcker-Williams
distribution function

$$
f_\gamma(z) = \frac{\alpha}{2\pi} \log\frac{s}{m^2}
\left[ \frac{1 + (1-z)^2}{z} \right] \tag{17.98}
$$


playing the role of the probability to find a photon of longitudinal fraction z 
in the incident electron. 
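
To get a feeling for the size of the distribution (17.98), the sketch below evaluates it and integrates it over a range of $z$. The numerical values of $\alpha$ and the electron mass, the choice of $s$, and the lower cutoff `z_min` are assumptions made for illustration; the closed form used for comparison follows from writing the bracket as $2/z - 2 + z$.

```python
import math

ALPHA = 1.0 / 137.036      # fine-structure constant (assumed value)
M_ELECTRON = 0.000511      # electron mass in GeV (assumed value)


def f_gamma(z, s, m=M_ELECTRON, alpha=ALPHA):
    """Weizsacker-Williams photon distribution, Eq. (17.98); s in GeV^2."""
    return alpha / (2.0 * math.pi) * math.log(s / m ** 2) * (1.0 + (1.0 - z) ** 2) / z


def photon_number(z_min, s, steps=20000):
    """Midpoint-rule integral of f_gamma over z_min < z < 1."""
    h = (1.0 - z_min) / steps
    return h * sum(f_gamma(z_min + (i + 0.5) * h, s) for i in range(steps))


def photon_number_exact(z_min, s, m=M_ELECTRON, alpha=ALPHA):
    """Closed form of the same integral, using (1 + (1-z)^2)/z = 2/z - 2 + z."""
    shape = 2.0 * math.log(1.0 / z_min) - 2.0 * (1.0 - z_min) + 0.5 * (1.0 - z_min ** 2)
    return alpha / (2.0 * math.pi) * math.log(s / m ** 2) * shape


s = 100.0 ** 2   # illustrative: a 100 GeV center-of-mass energy
print(photon_number(0.1, s), photon_number_exact(0.1, s))
```

The integral diverges logarithmically as `z_min` goes to zero, reflecting the $1/z$ growth of the distribution for soft photons.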


The Electron Distribution 


The first diagram of Fig. 17.15, with an emitted photon and a virtual electron, 
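can be treated in the same way. The analogue of (17.93) is

$$
\sigma = \frac{1}{(1+v_X)\,2p\,2E_X} \int \frac{d^3q}{(2\pi)^3}\,\frac{1}{2q^0}
\int d\Pi_Y \left[ \frac12 \sum |\mathcal{M}|^2 \right]
\left( \frac{1}{k^2} \right)^2 |\mathcal{M}_{e^- X}|^2.
$$

Following the steps that led to (17.97), we find

$$
\begin{aligned}
\sigma(e^- X \to \gamma Y)
&= \int \frac{dz\,dp_\perp^2}{16\pi^2 z}
\left[ \frac12 \sum |\mathcal{M}|^2 \right] \frac{z^2(1-z)}{p_\perp^4}
\cdot \sigma(e^- X \to Y) \\
&= \int \frac{dz\,dp_\perp^2}{16\pi^2}\, \frac{z(1-z)}{p_\perp^4}\,
\frac{2e^2 p_\perp^2}{z(1-z)} \left[ \frac{1 + (1-z)^2}{z} \right]
\cdot \sigma(e^- X \to Y) \\
&= \int_0^1 dz \int \frac{dp_\perp^2}{p_\perp^2}\, \frac{\alpha}{2\pi}
\left[ \frac{1 + (1-z)^2}{z} \right] \cdot \sigma(e^- X \to Y),
\end{aligned}
\tag{17.99}
$$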

where the intermediate electron carries a longitudinal fraction $(1-z)$.

It is tempting to substitute $x = (1-z)$ and interpret the factor multiplying
the cross section under the integral in (17.99) as the parton distribution for
finding an electron parton in the electron. This would give

$$
f_e(x) = \frac{\alpha}{2\pi} \log\frac{s}{m^2}
\left[ \frac{1 + x^2}{(1-x)} \right]. \tag{17.100}
$$

However, this expression is not adequate. Most obviously, it does not take
into account the processes without radiation, in which the electron remains
an electron at the full energy. This is easily remedied by considering (17.100)
as the order-$\alpha$ correction to the most naive expectation,

$$
f_e^{(0)}(x) = \delta(1-x), \tag{17.101}
$$

in which we consider the electron to contain only an electron at the full energy. 
Unfortunately, the sum of (17.101) and (17.100) still does not give an adequate 
description of the electron distribution, for two reasons. First, Eq. (17.100) 
diverges near x = 1, and we need a prescription for treating this singularity. 
Second, while Eq. (17.100) takes into account the virtual electrons moved to 
longitudinal fraction x from x = 1 by radiation, it does not take into account 
the concomitant loss of electrons from the delta function peak at x = 1. 

The divergence of (17.100) at $x = 1$ corresponds to the emission of soft
photons. We saw in Section 6.5 that the emission of soft photons does not
affect the rate of a QED reaction. Order by order in $\alpha$, one finds that
infrared-divergent positive contributions to the total rate from the emission
of soft photons are balanced by negative contributions from diagrams with soft
virtual photons. In the present example, the negative contribution must
decrease the weight of the process in which no photon is emitted. Thus, to
order $\alpha$, the parton distribution for electrons in the electron should
have the form

$$
f_e(x) = \delta(1-x) + \frac{\alpha}{2\pi} \log\frac{s}{m^2}
\left( \frac{1 + x^2}{(1-x)} - A\,\delta(1-x) \right). \tag{17.102}
$$

The coefficient $A$ results from diagrams with virtual photons that we have
not computed. However, the effect of these diagrams is easy to understand;
they subtract from the delta function the probability that has been moved
to lower $x$ by radiation, so that the integral over the full term of order
$\alpha$ is zero. Another way of expressing this criterion is that $A$ is
determined by the condition that the electron contain exactly one electron
parton,

$$
\int_0^1 dx\, f_e(x) = 1. \tag{17.103}
$$


(This equation will be modified below, when we include pair-creation
processes.)

It is not so clear how to integrate over the singular denominator in (17.100)
to determine $A$ explicitly. It is conventional to define a distribution that
can be integrated by subtracting a delta function from the singular term.
Define the distribution

$$
\frac{1}{(1-x)_+} \tag{17.104}
$$

to agree with the function $1/(1-x)$ for all values of $x$ less than 1, and to
have a singularity at $x = 1$ such that the integral of this distribution with
any smooth function $f(x)$ gives

$$
\int_0^1 dx\, \frac{f(x)}{(1-x)_+} = \int_0^1 dx\, \frac{f(x) - f(1)}{(1-x)}. \tag{17.105}
$$

Less formally,

$$
\frac{1}{(1-x)_+} = \lim_{\epsilon \to 0} \left[ \frac{1}{(1-x)}\,\theta(1-x-\epsilon)
- \delta(1-x) \int_0^{1-\epsilon} \frac{dx'}{(1-x')} \right]. \tag{17.106}
$$

The more formal definition (17.105) is often easier to use in practice.

Using this definition, we can bring a piece of the delta function into the
singular term of (17.102) by changing the denominator $(1-x)$ to $(1-x)_+$.
Then, to normalize (17.102), we need the integral

$$
\int_0^1 dx\, \frac{1 + x^2}{(1-x)_+} = -\frac32.
$$

Our final form of the electron distribution, to order $\alpha$, is

$$
f_e(x) = \delta(1-x) + \frac{\alpha}{2\pi} \log\frac{s}{m^2}
\left[ \frac{1 + x^2}{(1-x)_+} + \frac32\,\delta(1-x) \right]. \tag{17.107}
$$


Figure 17.17. Higher-order diagrams with collinear photon emission:
(a) two collinear photons; (b) many collinear photons.

This distribution is now properly normalized, but it is still highly singular
near $x = 1$. Thus, we should expect higher-order corrections to the electron
distribution function to be important in this region. We must, then, think
about how to treat the emission of many collinear photons.
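
The plus prescription of Eq. (17.105) is easy to test numerically. The sketch below, with hypothetical helper names, integrates a smooth function against $1/(1-x)_+$ by the subtraction formula and confirms that the order-$\alpha$ bracket of Eq. (17.107) integrates to zero, so that the normalization (17.103) is preserved. A simple midpoint rule is assumed to be accurate enough here because the subtracted integrand is smooth.

```python
def plus_integral(f, steps=10000):
    """Integral over 0 < x < 1 of f(x)/(1-x)_+ using Eq. (17.105), midpoint rule."""
    h = 1.0 / steps
    f1 = f(1.0)
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (f(x) - f1) / (1.0 - x)
    return total * h


def order_alpha_norm():
    """Integral of the bracket in Eq. (17.107): plus term and (3/2) delta(1-x)."""
    return plus_integral(lambda x: 1.0 + x * x) + 1.5


print(plus_integral(lambda x: 1.0 + x * x))   # should be close to -3/2
print(order_alpha_norm())                     # should be close to 0
```

The subtraction turns $(1+x^2)/(1-x)$ into $-(1+x)$, which is why the integral is finite and equal to $-3/2$.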


Multiple Splittings 

In fact, it is not difficult to extend the analysis we have just completed to 
account for emission of many collinear photons. Consider the process shown in 
Fig. 17.17(a), in which photon 1 is radiated with a transverse momentum
$p_{1\perp}$ and photon 2 with a transverse momentum $p_{2\perp}$. The emission
of photon 2 can be computed just as we did above. If $p_{2\perp} \ll p_{1\perp}$,
the first virtual electron is very close to mass shell compared to
$p_{1\perp}$ and so we can ignore its virtuality in computing the emission of
photon 1. The double photon emission gives a contribution of order

$$
\left( \frac{\alpha}{2\pi} \right)^2 \int_{m^2}^{s} \frac{dp_{1\perp}^2}{p_{1\perp}^2}
\int_{m^2}^{p_{1\perp}^2} \frac{dp_{2\perp}^2}{p_{2\perp}^2}
= \frac12 \left( \frac{\alpha}{2\pi} \right)^2 \log^2\frac{s}{m^2}.
$$

In the opposite limit, $p_{2\perp} \gg p_{1\perp}$, there is no denominator of
order $p_{1\perp}^2$, and so we do not find a double logarithm. Only in the case
$p_{2\perp} \ll p_{1\perp}$ can the contribution of order $\alpha^2$ compete with
the contribution of order $\alpha$.

This argument extends to the emission of arbitrarily many collinear photons,
Fig. 17.17(b). The region of integration over the photon phase space
corresponding to the ordering

$$
p_{1\perp} \gg p_{2\perp} \gg p_{3\perp} \gg \cdots \tag{17.108}
$$

gives a contribution that contains the factor

$$
\frac{1}{n!} \left( \frac{\alpha}{2\pi} \right)^n \log^n\frac{s}{m^2}. \tag{17.109}
$$
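
The $1/n!$ in (17.109) is the volume of the ordered region in the logarithmic variables $t_i = \log(p_{i\perp}^2/m^2)$, each running from 0 to $L = \log(s/m^2)$. The sketch below checks this by iterating a cumulative trapezoid integral on a grid. The function name, the grid size, and the sample values of $s$ and $m$ are illustrative assumptions.

```python
import math


def ordered_volume(n, big_log, steps=2000):
    """Volume of the region big_log > t_1 > t_2 > ... > t_n > 0, by nested integration."""
    h = big_log / steps
    g = [1.0] * (steps + 1)               # g_0(t) = 1 on the grid t_i = i * h
    for _ in range(n):
        cumulative = [0.0]
        for i in range(1, steps + 1):
            # trapezoid rule for g_k(t) = integral from 0 to t of g_{k-1}(t') dt'
            cumulative.append(cumulative[-1] + 0.5 * h * (g[i - 1] + g[i]))
        g = cumulative
    return g[-1]


# Illustrative scale: sqrt(s) = 91.19 GeV and the electron mass in GeV.
L = math.log(91.19 ** 2 / 0.000511 ** 2)
for n in (1, 2, 3):
    print(n, ordered_volume(n, L), L ** n / math.factorial(n))
```

Each additional strongly ordered emission thus costs one factor of $(\alpha/2\pi)\log(s/m^2)$, divided by the symmetry factor of the ordering.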


If the photon transverse momenta are ordered in any other way, the
contribution from that region contains at least one less power of the large
logarithm at the same order in $\alpha$. If condition (17.108) holds, the
virtual electron momenta are increasingly off-mass-shell as one proceeds from
the outside of the diagram toward the hard collision. In this case, the
electron momenta are said to be strongly ordered.

This set of conclusions has an interesting physical interpretation. Since
the intermediate electrons are increasingly virtual as we go into the diagram,
it is natural to interpret them as components of the physical electron when
this particle is analyzed at successively smaller distance scales. The
intermediate electron with $k^2 \sim p_\perp^2$ may be thought of as a
constituent of the electron made visible when the wavefunction of the physical
electron is probed with a resolution $\Delta r \sim (p_\perp)^{-1}$. In this
picture, the electron seen at one resolution can be resolved at a finer scale
into a more virtual electron and a number of photons.

From either the perspective of computing Feynman diagrams or the grander
perspective of the electron structure, it is useful to imagine the splitting
of the electron into a virtual electron plus photons to be a continuous
evolution process as a function of the transverse momentum of the electron
constituent. To describe this process mathematically, we introduce an explicit
$p_\perp$ dependence of the electron and photon distribution functions. We
define the functions $f_\gamma(x,Q)$ and $f_e(x,Q)$ to give the probabilities of
finding a photon or an electron of longitudinal fraction $x$ in the physical
electron, taking into account collinear photon emissions with transverse
momenta $p_\perp < Q$. If $Q$ is slightly increased to $Q + \Delta Q$, we must
take into account the possibility that an electron constituent in $f_e(x,Q)$
will radiate a photon with $Q < p_\perp < Q + \Delta Q$. The differential
probability for an electron to split off a photon that carries away a
fraction $z$ of its energy is

$$
\frac{\alpha}{2\pi}\, \frac{dp_\perp^2}{p_\perp^2}\, \frac{1 + (1-z)^2}{z}. \tag{17.110}
$$


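The new photon distribution can therefore be computed as follows:

$$
\begin{aligned}
f_\gamma(x, Q+\Delta Q) &= f_\gamma(x,Q) + \int_0^1 dx' \int_0^1 dz
\left[ \frac{\alpha}{2\pi}\, \frac{\Delta Q^2}{Q^2}\, \frac{1 + (1-z)^2}{z} \right]
f_e(x', p_\perp)\, \delta(x - zx') \\
&= f_\gamma(x,Q) + \frac{\Delta Q}{Q} \int_x^1 \frac{dz}{z}
\left[ \frac{\alpha}{\pi}\, \frac{1 + (1-z)^2}{z} \right]
f_e\Bigl( \frac{x}{z}, p_\perp \Bigr).
\end{aligned}
\tag{17.111}
$$

Passing to a continuous evolution, we find that the function $f_\gamma(x,Q)$ is
determined by the integral-differential equation

$$
\frac{d}{d\log Q}\, f_\gamma(x,Q) = \int_x^1 \frac{dz}{z}
\left[ \frac{\alpha}{\pi}\, \frac{1 + (1-z)^2}{z} \right]
f_e\Bigl( \frac{x}{z}, Q \Bigr). \tag{17.112}
$$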


Similarly, the distribution of component electrons in the physical electron
will evolve with $Q$, reflecting the appearance of electrons at lower values of
$x$ due to photon radiation, and the disappearance of these electrons at higher
$x$. The term in brackets in Eq. (17.107) gives a correct accounting of both
effects for the radiation of a single photon. Thus, the electron distribution
evolves according to

$$
\frac{d}{d\log Q}\, f_e(x,Q) = \int_x^1 \frac{dz}{z}
\left[ \frac{\alpha}{\pi} \left( \frac{1 + z^2}{(1-z)_+} + \frac32\,\delta(1-z) \right) \right]
f_e\Bigl( \frac{x}{z}, Q \Bigr). \tag{17.113}
$$


By integrating these integral-differential equations using appropriate
initial conditions, we sum all of the logarithmically enhanced terms of the
form (17.109). The initial conditions should be fixed at a point that will
reproduce the correct denominator of the logarithms in Eqs. (17.98) and
(17.107). Thus, we should set

$$
f_e(x,Q) = \delta(1-x), \qquad f_\gamma(x,Q) = 0, \tag{17.114}
$$

at $Q^2 \sim m^2$.

The resulting distribution functions can be used to compute the cross
sections for electron hard scattering from an arbitrary target. Then
Eqs. (17.97) and (17.99) should be replaced by

$$
\begin{aligned}
\sigma(e^- X \to e^- + n\gamma + Y) &= \int_0^1 dx\, f_\gamma(x,Q)\, \sigma(\gamma X \to Y), \\
\sigma(e^- X \to n\gamma + Y) &= \int_0^1 dx\, f_e(x,Q)\, \sigma(e^- X \to Y),
\end{aligned}
\tag{17.115}
$$

where the cross sections under the integrals are computed for a photon or
electron carrying a fraction $x$ of the original electron momentum, the
functions $f_\gamma(x,Q)$, $f_e(x,Q)$ are the solutions to Eqs. (17.112) and
(17.113), and the momentum $Q$ is chosen as a characteristic momentum transfer
of the $\gamma X$ or $e^- X$ subprocess.


Photon Splitting to Pairs 

The evolution equations for $f_\gamma(x)$ and $f_e(x)$ need one more modification
before they can be considered complete. As written, these equations account
for the radiation of photons by electrons to all orders. However, they omit
another process that is of the same order in $\alpha$: the splitting of a
photon into an electron-positron pair. We must include this process in our
evolution equations, because the process shown in Fig. 17.18, for example, has
the same logarithmic enhancement as that shown in Fig. 17.17(a).

We can compute the effects of photon splitting in the same way that we
computed the effects of photon radiation. The basic kinematics of the process
is very similar, as shown in Fig. 17.19; the only difference is that the photon
is now in the initial state, while the final state consists of an almost
collinear electron-positron pair. We need to work out the analogue of
Eq. (17.92) for this process.

Figure 17.18. A process that involves $e^+e^-$ pair creation enhanced by a
collinear mass singularity.


Figure 17.19. Kinematics of the vertex for photon conversion to a collinear 
electron-positron pair. 


Consider the case in which the outgoing electron is left-handed. Then
the outgoing positron must be right-handed, by helicity conservation; its spin
wavefunction will contain a left-handed spinor. Let us take the electron
momentum to be $k$, given by Eq. (17.82), and the positron momentum to be $q$.
Then the vertex gives a matrix element

$$
i\mathcal{M} = -ie\, \bar u_L(k)\, \gamma_\mu\, v_L(q)\, \epsilon_T^\mu(p), \tag{17.116}
$$

where the photon polarization vector can be either left- or right-handed. When
we insert the explicit forms for the massless spinors, we obtain

$$
i\mathcal{M} = ie\, \sqrt{2(1-z)p}\, \sqrt{2zp}\;\xi^\dagger(k)\, \sigma^i\, \xi(q)
\cdot \epsilon_T^i(p),
$$

where the electron and positron spinors are given, to order $p_\perp$, by

$$
\xi(q) = \begin{pmatrix} -\dfrac{p_\perp}{2zp} \\ 1 \end{pmatrix}, \qquad
\xi(k) = \begin{pmatrix} \dfrac{p_\perp}{2(1-z)p} \\ 1 \end{pmatrix}.
$$

The polarization vectors for the photon are

$$
\epsilon_L^i(p) = \frac{1}{\sqrt2}\,(1, -i, 0), \qquad
\epsilon_R^i(p) = \frac{1}{\sqrt2}\,(1, i, 0).
$$

Dotting these vectors into $\sigma^i$, we find for the polarized matrix elements

$$
i\mathcal{M}(\gamma_L \to e_L^- e_R^+) = -ie\, \frac{\sqrt{2z(1-z)}}{z}\, p_\perp,
$$

and

$$
i\mathcal{M}(\gamma_R \to e_L^- e_R^+) = +ie\, \frac{\sqrt{2z(1-z)}}{(1-z)}\, p_\perp.
$$

Again, the matrix elements are unchanged if all helicities are flipped. Thus
the squared matrix element, averaged over initial photon polarizations, is

$$
\frac12 \sum_{\text{pols.}} |\mathcal{M}|^2 = \frac{2e^2 p_\perp^2}{z(1-z)}
\left[ z^2 + (1-z)^2 \right], \tag{17.117}
$$


where z is the momentum fraction carried by the positron. The first term in 
the brackets comes from processes in which the spin of the positron is parallel 
to the spin of the photon; the second term comes from processes where the 
electron spin is parallel to the photon spin. 

The squared matrix element (17.117) generates an evolution of constituent 
photons into electrons and positrons. The form of the evolution equation is 
similar to (17.113), but with the photon distribution on the right-hand side, 
and with the expression in parentheses replaced by 

$$
\left( z^2 + (1-z)^2 \right). \tag{17.118}
$$

When we create an electron-positron pair, we must remove a photon; this
requires a negative term in the evolution equation for the photon
distribution (17.112) that contains a delta function multiplying the
normalization of (17.118):

$$
\int_0^1 dz\, \left( z^2 + (1-z)^2 \right) = \frac23. \tag{17.119}
$$


Evolution Equations for QED 

Including the effects of pair creation, we find the complete evolution equations 
for electron, positron, and photon distributions in QED. These equations, 
originally derived by Gribov and Lipatov, sum the leading logarithms from 
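collinear singularities to all orders in $\alpha$. The evolution equations take
the form

$$
\begin{aligned}
\frac{d}{d\log Q}\, f_\gamma(x,Q) &= \frac{\alpha}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{\gamma\leftarrow e}(z) \Bigl[ f_e\Bigl(\frac{x}{z},Q\Bigr)
+ f_{\bar e}\Bigl(\frac{x}{z},Q\Bigr) \Bigr]
+ P_{\gamma\leftarrow\gamma}(z)\, f_\gamma\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}, \\
\frac{d}{d\log Q}\, f_e(x,Q) &= \frac{\alpha}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{e\leftarrow e}(z)\, f_e\Bigl(\frac{x}{z},Q\Bigr)
+ P_{e\leftarrow\gamma}(z)\, f_\gamma\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}, \\
\frac{d}{d\log Q}\, f_{\bar e}(x,Q) &= \frac{\alpha}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{e\leftarrow e}(z)\, f_{\bar e}\Bigl(\frac{x}{z},Q\Bigr)
+ P_{e\leftarrow\gamma}(z)\, f_\gamma\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}.
\end{aligned}
\tag{17.120}
$$

The splitting functions are given by

$$
\begin{aligned}
P_{e\leftarrow e}(z) &= \frac{1 + z^2}{(1-z)_+} + \frac32\,\delta(1-z), \\
P_{\gamma\leftarrow e}(z) &= \frac{1 + (1-z)^2}{z}, \\
P_{e\leftarrow\gamma}(z) &= z^2 + (1-z)^2, \\
P_{\gamma\leftarrow\gamma}(z) &= -\frac23\,\delta(1-z).
\end{aligned}
\tag{17.121}
$$

To obtain the distribution functions for an electron relevant to a given
momentum transfer $Q$, we should integrate these equations with the initial
conditions

$$
f_e(x,Q) = \delta(1-x), \qquad f_{\bar e}(x,Q) = 0, \qquad f_\gamma(x,Q) = 0, \tag{17.122}
$$

at $Q = m$. With different initial conditions, the same equations give the
distribution functions for a physical positron or photon. The solutions to
these equations are used as in Eq. (17.115) to compute cross sections
involving processes induced by electrons, positrons, or photons that involve
large momentum transfer.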
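
One simple consequence of Eqs. (17.120)-(17.122) can be worked out without solving the full integro-differential system. Multiplying by $x$ and integrating over $x$ turns each convolution into a product of second moments, so the momentum fractions carried by leptons and by photons obey a pair of ordinary differential equations. The sketch below integrates them with a fourth-order Runge-Kutta step. The moment values were evaluated by hand from (17.121) for this example, $\alpha$ is held fixed (its running is ignored), and all function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.036   # fixed coupling (assumption: running of alpha is ignored)

# Second moments, integral_0^1 dz z P(z), of the splitting functions (17.121).
M_EE = -4.0 / 3.0    # P_{e<-e}
M_GE = 4.0 / 3.0     # P_{gamma<-e}
M_EG = 1.0 / 3.0     # P_{e<-gamma}
M_GG = -2.0 / 3.0    # P_{gamma<-gamma}


def derivs(lepton, photon, alpha=ALPHA):
    """d/dlogQ of the momentum fractions carried by e + ebar and by photons."""
    a = alpha / math.pi
    d_lepton = a * (M_EE * lepton + 2.0 * M_EG * photon)
    d_photon = a * (M_GE * lepton + M_GG * photon)
    return d_lepton, d_photon


def evolve_momentum_fractions(log_q_over_m, steps=1000, alpha=ALPHA):
    """RK4 evolution from Q = m, where the electron carries all the momentum."""
    lepton, photon = 1.0, 0.0
    h = log_q_over_m / steps
    for _ in range(steps):
        k1 = derivs(lepton, photon, alpha)
        k2 = derivs(lepton + 0.5 * h * k1[0], photon + 0.5 * h * k1[1], alpha)
        k3 = derivs(lepton + 0.5 * h * k2[0], photon + 0.5 * h * k2[1], alpha)
        k4 = derivs(lepton + h * k3[0], photon + h * k3[1], alpha)
        lepton += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        photon += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return lepton, photon


def photon_fraction_closed_form(log_q_over_m, alpha=ALPHA):
    """Solution of dG/dt = (alpha/pi) (4/3 - 2 G) with G(0) = 0."""
    return (2.0 / 3.0) * (1.0 - math.exp(-2.0 * alpha / math.pi * log_q_over_m))


t = math.log(100.0 / 0.000511)    # illustrative: Q = 100 GeV, m = 0.511 MeV
print(evolve_momentum_fractions(t), photon_fraction_closed_form(t))
```

In this fixed-coupling approximation the photons carry only a few percent of the electron's momentum even at $Q$ of order 100 GeV, and the total momentum stays equal to 1 throughout the evolution.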

The evolution equations (17.120) are constructed in such a way as to
conserve electron number and longitudinal momentum. Thus, the basic sum rules
(17.36) and (17.39) satisfied by the parton distributions of hadrons also
apply to the QED distribution functions. Specifically, the distribution
functions of the electron contain one net electron constituent,

$$
\int_0^1 dx\, \bigl[ f_e(x,Q) - f_{\bar e}(x,Q) \bigr] = 1, \tag{17.123}
$$

and account for the total momentum of the physical electron,

$$
\int_0^1 dx\, x\, \bigl[ f_e(x,Q) + f_{\bar e}(x,Q) + f_\gamma(x,Q) \bigr] = 1. \tag{17.124}
$$

It is an instructive exercise to verify explicitly, using Eq. (17.120), that the 
values of these integrals do not depend on Q. 
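
A numerical version of this exercise is sketched below. Integrating either side of (17.120) over $x$, with weight 1 or $x$, factorizes each convolution into a moment of a splitting function times a moment of a distribution, so the sum rules hold if three combinations of moments vanish. The helper names are hypothetical, the plus prescription is handled with Eq. (17.105), and a midpoint rule is assumed to be accurate enough.

```python
def midpoint(f, steps=20000):
    """Midpoint-rule integral of f over 0 < z < 1."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def moment_e_from_e(w):
    """Integral of w(z) P_{e<-e}(z), with the plus prescription of Eq. (17.105)."""
    w1 = w(1.0)
    regular = midpoint(lambda z: (w(z) * (1.0 + z * z) - 2.0 * w1) / (1.0 - z))
    return regular + 1.5 * w1


def moment_gamma_from_e(w):
    """Integral of w(z) P_{gamma<-e}(z); w must vanish at z = 0 for convergence."""
    return midpoint(lambda z: w(z) * (1.0 + (1.0 - z) ** 2) / z)


def moment_e_from_gamma(w):
    """Integral of w(z) P_{e<-gamma}(z)."""
    return midpoint(lambda z: w(z) * (z * z + (1.0 - z) ** 2))


def moment_gamma_from_gamma(w):
    """Integral of w(z) P_{gamma<-gamma}(z), a pure delta-function term."""
    return -2.0 / 3.0 * w(1.0)


def one(z):
    return 1.0


def ident(z):
    return z


# Electron number (17.123): pair creation cancels in f_e - f_ebar.
number_check = moment_e_from_e(one)
# Momentum (17.124): what a lepton loses goes to photons, and vice versa.
lepton_momentum_check = moment_e_from_e(ident) + moment_gamma_from_e(ident)
photon_momentum_check = 2.0 * moment_e_from_gamma(ident) + moment_gamma_from_gamma(ident)
print(number_check, lepton_momentum_check, photon_momentum_check)
```

All three combinations should vanish up to discretization error, which is the statement that the integrals (17.123) and (17.124) do not depend on $Q$.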

The Altarelli-Parisi Equations 

If we encounter mass singularities in QED associated with collinear photon 
emission, we must also encounter mass singularities in QCD associated with 
collinear gluon and quark emission. If we compute the corrections of order
$\alpha_s$ to the leading-order parton cross sections discussed in Sections 17.3
and 17.4, using massless quarks and gluons, we will find that these correction
terms diverge when we integrate over the collinear configurations. Thus the
parton-model expressions, at least in their simplest form, break down already
at the next-to-leading order in $\alpha_s$.

However, assuming that the singularities of QCD are no worse than those 
of QED, the considerations of the previous section tell us how to treat these 
singular terms. In QED, we found it natural to include the large corrections 
associated with the mass singularities in the parton distributions rather than 
in the hard-scattering cross sections. Viewed in this way, the singular terms 
supply the kernel of an evolution equation for the parton distributions as a 
function of the logarithm of the momentum scale. Hard scattering with a
momentum transfer $Q$ probes the electron at a distance of order $Q^{-1}$. When
the electron wavefunction is resolved to very small scales, it appears as a
constituent electron, carrying only a fraction of the total longitudinal
momentum, plus a number of constituent photons and electron-positron pairs.
Any one of these constituents that carries a substantial fraction of the total
electron momentum can initiate a hard-scattering process.

Precisely the same logic applies to the calculation of QCD cross sections. 
The contributions from the region of collinear gluon or quark emission should 
be associated with the parton distribution functions rather than with the 
hard-scattering cross sections. If we make this association, we find that the 
parton distributions are no longer independent of the momentum $Q$ that
characterizes the hard-scattering process; rather, they now evolve
logarithmically with $Q$. For example, the basic equation (17.30) for
deep-inelastic scattering will become

$$
\frac{d^2\sigma}{dx\,dy}(e^- p \to e^- X) =
\Bigl( \sum_f x f_f(x,Q)\, Q_f^2 \Bigr) \cdot \frac{2\pi\alpha^2 s}{Q^4}
\bigl[ 1 + (1-y)^2 \bigr], \tag{17.125}
$$

and so Bjorken scaling will be violated. Since this violation takes place only on
a logarithmic scale in $Q^2$, it will be a subtle effect, and approximate Bjorken
scaling will still be a prediction of QCD. But the violation of Bjorken scaling 
is inevitable, since QCD is a quantum field theory with degrees of freedom at 
all momentum scales. As we probe the proton wavefunction at increasingly 
short distances, we excite the high-momentum degrees of freedom and resolve 
the wavefunction into an increasing number of quarks, antiquarks, and gluons. 

The evolution of the QED parton distributions, governed by Eq. (17.120),
is characterized by the parameter $\alpha/\pi$, so the parton distributions
change by ~1% as $Q$ is changed by a factor of 10. In QCD, the corresponding
factor governing the rate of evolution should be $\alpha_s(Q)/\pi$. Thus, when
$Q$ is very small, the evolution is rapid and contributions of higher order in
perturbation theory are important. Ultimately, the initial conditions for the
evolution are determined by the form of the proton wavefunction at large
distance scales, which cannot be calculated using Feynman diagrams. On the
other hand, when $Q$ is large, well above 1 GeV in practice, the evolution
becomes slow and is dominated by the leading order in perturbation theory. In
that case, QCD perturbation theory makes precise predictions for the form of
the evolution of the parton distributions, and these predictions can be tested
against experiment.

Figure 17.20. The three vertices that contribute to parton evolution in QCD.

To derive the evolution equations of parton distributions in QCD we can 
use the same techniques and logic that we used above for QED. There is a 
subtlety, that the reduction of the gluon propagator to transverse polarization 
states in the limit $q^2 \to 0$, Eq. (17.81), cannot be proved so simply as in
QED. However, the result is correct also in the non-Abelian case.* Once this
technical point is resolved, the kinematics of collinear emission is exactly the
same as in QED. Thus we find evolution equations of the same form as in QED,
modified only by the replacement of $\alpha$ by $\alpha_s$, the insertion of
appropriate color factors, and the accounting of the effects of the
three-gluon vertex.

Collinear emission processes in QCD involve the three vertices shown in 
Fig. 17.20. Of these, the first two have the same Lorentz structure as those 
shown in Figs. 17.16 and 17.19. The only difference, aside from the strength 
of the coupling constant, comes in the color indices. We will treat color just 
as we treated spin in the preceding analysis: We average over initial colors, 
and sum over final colors. Then the first vertex of Fig. 17.20, representing the 
splitting of a quark into a quark and a gluon, receives the color factor 

$$
\frac13\, \mathrm{tr}[t^a t^a] = C_2(r) = \frac43. \tag{17.126}
$$

The second vertex, representing the splitting of a gluon into a quark-antiquark
pair, receives the factor

$$
\frac18\, \mathrm{tr}[t^a t^a] = \frac12. \tag{17.127}
$$
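
Both color factors follow from the single number $\mathrm{tr}[t^a t^a]$, summed over $a$, in the fundamental representation of SU(3). The sketch below evaluates it explicitly, assuming the standard Gell-Mann basis with $t^a = \lambda^a/2$; any basis normalized so that $\mathrm{tr}[t^a t^b] = \frac12 \delta^{ab}$ would give the same result. The function names are illustrative.

```python
import math


def gell_mann_matrices():
    """The eight Gell-Mann matrices (standard basis, assumed for this sketch)."""
    s3 = 1.0 / math.sqrt(3.0)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace_tata():
    """Sum over a of tr[t^a t^a], with t^a = lambda^a / 2."""
    total = 0
    for lam in gell_mann_matrices():
        t = [[x / 2 for x in row] for row in lam]
        prod = matmul(t, t)
        total += sum(prod[i][i] for i in range(3))
    return total


trace = trace_tata()
quark_splitting_factor = (trace / 3).real    # average over 3 quark colors, Eq. (17.126)
gluon_splitting_factor = (trace / 8).real    # average over 8 gluon colors, Eq. (17.127)
print(quark_splitting_factor, gluon_splitting_factor)
```

The only difference between the two factors is the number of initial color states averaged over, 3 for a quark and 8 for a gluon.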

The third vertex in Fig. 17.20 represents the splitting of a gluon to two 
gluons, an effect that is new to the non-Abelian case. It is straightforward to 
compute the contribution of this vertex to the evolution equations by taking 
the matrix elements of the vertex between transverse gluon states of definite 
helicity. This calculation is the subject of Problem 17.4.


*See, for example, J. Collins and D. Soper, in A. Mueller, Quantum
Chromodynamics (World Scientific, Singapore, 1991).

By accounting for all of these effects, we can modify the QED evolution
equations (17.120) into the correct set of evolution equations for parton
distributions in QCD. These are known as the Altarelli-Parisi equations. They
describe the coupled evolution of parton distributions $f_f(x,Q)$,
$f_{\bar f}(x,Q)$ for each flavor of quark and antiquark that can be treated as
massless at the scale $Q$, together with the parton distribution of gluons,
$f_g(x,Q)$. Explicitly,


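$$
\begin{aligned}
\frac{d}{d\log Q}\, f_g(x,Q) &= \frac{\alpha_s(Q^2)}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{g\leftarrow q}(z) \sum_f \Bigl[ f_f\Bigl(\frac{x}{z},Q\Bigr)
+ f_{\bar f}\Bigl(\frac{x}{z},Q\Bigr) \Bigr]
+ P_{g\leftarrow g}(z)\, f_g\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}, \\
\frac{d}{d\log Q}\, f_f(x,Q) &= \frac{\alpha_s(Q^2)}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{q\leftarrow q}(z)\, f_f\Bigl(\frac{x}{z},Q\Bigr)
+ P_{q\leftarrow g}(z)\, f_g\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}, \\
\frac{d}{d\log Q}\, f_{\bar f}(x,Q) &= \frac{\alpha_s(Q^2)}{\pi} \int_x^1 \frac{dz}{z}
\Bigl\{ P_{q\leftarrow q}(z)\, f_{\bar f}\Bigl(\frac{x}{z},Q\Bigr)
+ P_{q\leftarrow g}(z)\, f_g\Bigl(\frac{x}{z},Q\Bigr) \Bigr\}.
\end{aligned}
\tag{17.128}
$$

The first three splitting functions can be taken from Eqs. (17.121), multiplied
by the color factors computed in Eqs. (17.126) and (17.127):

$$
\begin{aligned}
P_{q\leftarrow q}(z) &= \frac43 \left[ \frac{1 + z^2}{(1-z)_+} + \frac32\,\delta(1-z) \right], \\
P_{g\leftarrow q}(z) &= \frac43 \left[ \frac{1 + (1-z)^2}{z} \right], \\
P_{q\leftarrow g}(z) &= \frac12 \left[ z^2 + (1-z)^2 \right].
\end{aligned}
\tag{17.129}
$$

The fourth splitting function requires also the computation of Problem 17.4;
the result is

$$
P_{g\leftarrow g}(z) = 6 \left[ \frac{(1-z)}{z} + \frac{z}{(1-z)_+} + z(1-z)
+ \Bigl( \frac{11}{12} - \frac{n_f}{18} \Bigr) \delta(1-z) \right]. \tag{17.130}
$$

The final term in this expression, which is proportional to $n_f$, the number
of light quark flavors, is the subtraction term associated with gluon splitting
into $q\bar q$ pairs. The Altarelli-Parisi equations describe the evolution of
parton distributions for any hadron, or any hadronic constituent, up to
corrections of order $\alpha_s$ that are not enhanced by large logarithms.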

Figure 17.21. The u quark parton distribution function $x f_u(x,Q)$ at
$Q$ = 2, 20, and 200 GeV, showing the effects of parton evolution according to
the Altarelli-Parisi equations. These curves are taken from the CTEQ fit to
deep inelastic scattering data described in Fig. 17.6.

Our derivation of the Altarelli-Parisi equations respects the conservation
laws of QCD for quark numbers and longitudinal momentum. Thus, the
equations must respect the parton-model sum rules (17.36) and (17.39). As in
the QED case, it is instructive to verify explicitly that these integrals are
independent of $Q$.
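
For the momentum sum rule, the verification reduces to two statements about second moments of the splitting functions (17.129) and (17.130): the momentum lost by a quark reappears in gluons, and the momentum lost by a gluon reappears in gluons and in $q\bar q$ pairs of all $n_f$ flavors. The sketch below checks both numerically. Function names are hypothetical, and the plus prescription is applied through Eq. (17.105) with weight $z$.

```python
def midpoint(f, steps=20000):
    """Midpoint-rule integral of f over 0 < z < 1."""
    h = 1.0 / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def z_moment_qq():
    """Integral of z P_{q<-q}(z); the plus term becomes (z (1 + z^2) - 2)/(1 - z)."""
    regular = midpoint(lambda z: (z * (1.0 + z * z) - 2.0) / (1.0 - z))
    return (4.0 / 3.0) * (regular + 1.5)


def z_moment_gq():
    """Integral of z P_{g<-q}(z)."""
    return (4.0 / 3.0) * midpoint(lambda z: 1.0 + (1.0 - z) ** 2)


def z_moment_qg():
    """Integral of z P_{q<-g}(z)."""
    return 0.5 * midpoint(lambda z: z * (z * z + (1.0 - z) ** 2))


def z_moment_gg(n_f):
    """Integral of z P_{g<-g}(z); the plus term becomes (z^2 - 1)/(1 - z)."""
    regular = midpoint(
        lambda z: (1.0 - z) + (z * z - 1.0) / (1.0 - z) + z * z * (1.0 - z)
    )
    return 6.0 * (regular + 11.0 / 12.0 - n_f / 18.0)


def quark_momentum_balance():
    """Net momentum change when a quark splits: should vanish."""
    return z_moment_qq() + z_moment_gq()


def gluon_momentum_balance(n_f):
    """Net momentum change when a gluon splits (quarks and antiquarks of n_f flavors)."""
    return 2.0 * n_f * z_moment_qg() + z_moment_gg(n_f)


print(quark_momentum_balance(), gluon_momentum_balance(3), gluon_momentum_balance(5))
```

The gluon balance vanishes for every $n_f$ only because of the $-n_f/18$ term in (17.130), which is the role of the subtraction term described in the text.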

In QED, we could use the evolution equations to explicitly compute the 
structure function of the electron. In QCD, this is no longer possible, because 
the initial conditions required to integrate the equations are determined by the 
strong-coupling region of QCD and so are not known a priori. However, one 
can determine the initial conditions of the proton structure experimentally, by 
measuring the cross section for deep inelastic scattering at a given value of $Q^2$.
One can then predict the structure functions, and thus the deep inelastic cross
sections, at higher values of $Q^2$. There is one subtlety in this analysis: The
gluon distribution is not directly measured in deep inelastic scattering, but
it does enter the evolution equation for the quark distributions. Thus, some
of the information on the $Q^2$ dependence of deep inelastic scattering simply
goes into determining the gluon distribution. However, the gluon distribution 
is absolutely normalized by the momentum sum rule (17.39), so the evolution 
equations have predictive power even if this distribution must be fit from the 
data. 

The Altarelli-Parisi equations predict a characteristic form for the
evolution of parton distribution functions, shown in Fig. 17.21. Partons at
high $x$ tend to radiate and drop down to lower values of $x$. Meanwhile, new
partons are formed at low $x$ as products of this radiation. Thus, the parton
distributions decrease at large $x$ and increase much more rapidly at small $x$
as $Q^2$ increases. We can picture the proton as having more and more
constituents, which share its total momentum, as its wavefunction is probed on
finer and finer distance scales.

Figure 17.22. Dependence on $Q^2$ of the combination of quark distribution
functions $F_2 = \sum_f x Q_f^2 f_f(x, Q^2)$ measured in deep inelastic
electron-proton scattering. The various curves show the variation of $F_2$ for
fixed values of $x$, and the comparison of this variation to a model evolved
with the Altarelli-Parisi equations. The upper six data sets have been
multiplied by the indicated factors to separate them on the plot. The data
were compiled by M. Virchaux and R. Voss for the Particle Data Group, Phys.
Rev. D50, 1173 (1994), Fig. 32.2. The complete references to the original
experiments are given there.

Figure 17.22 shows the evolution of the combination of distribution
functions that is measured in deep inelastic scattering, as a function of
$Q^2$. We see the characteristic decrease of the distribution functions at
large $x$ and the increase at small $x$. The data are compared to a model
evolved according to the Altarelli-Parisi equations; this model apparently
describes the data quite well.


17.6 Measurements of $\alpha_s$

Before concluding our introductory survey of QCD, we should summarize the
quantitative verification of the theory. We discussed precision tests of QED
in Section 6.3, bringing together various measurements of the coupling
$\alpha$; the best determinations agree to eight significant figures. Since QCD
perturbation theory works only for hard-scattering processes, with
uncertainties due to soft processes that are difficult to estimate, this
theory has not been tested to such extreme accuracy. Nevertheless, it is
interesting to bring together the best available determinations of $\alpha_s$,
to see how well they agree.

In order to compare values of $\alpha_s$, it is necessary to express these using
a common set of conventions. First, one must set the renormalization scale;
a useful choice is the mass of the neutral weak boson $Z^0$: $m_Z = 91.19$ GeV.
Second, one must fix the renormalization scheme that defines the QCD coupling
constant at this scale. It has become conventional to use as a standard the
bare coupling after regularization by modified minimal subtraction,
Eq. (11.77). The resulting standard coupling constant is called
$\alpha_{s\,\overline{MS}}(m_Z)$.

Measurements of $\alpha_s$ from a number of types of experiments are summarized
in Table 17.1. In Section 17.2 we saw that one can obtain a value of
$\alpha_s$ from the measurement of the total cross section for $e^+e^-$
annihilation to hadrons or, equivalently, the ratio $R$ of the number of
observed hadronic and leptonic events. An independent measurement of $\alpha_s$
can be obtained from the fraction of $e^+e^-$ annihilation events with
three-jet final states or, equivalently, from the transverse momentum
distribution of produced hadrons relative to the jet axis. A number of
measurements of this type are collected and averaged under the heading ‘Event
shapes’. A similar measurement of $\alpha_s$ is obtained from the measurement
of the transverse momentum spectrum of $W$ bosons produced from
quark-antiquark annihilation at high-energy $p\bar p$ colliders. The gluon
radiative correction to the vertex in deep inelastic neutrino scattering can
also be used to extract $\alpha_s$. The rate of Bjorken scaling violation in
deep inelastic scattering is controlled by $\alpha_s$, and so this effect
provides another $\alpha_s$ measurement. The decays of the lightest $b\bar b$
bound state $\Upsilon$ and the $c\bar c$ bound state $\psi$ are governed by QCD
and yield a measurement of $\alpha_s$. Finally, the spectrum of $c\bar c$ and
$b\bar b$ bound states can be computed numerically in terms of the QCD
coupling constant, and the comparison with experiment gives a
determination of $\alpha_s$.

Table 17.1. Values of $\alpha_s(m_Z)$ Obtained from QCD Experiments

Process                                     $\alpha_s(m_Z)$     Q (GeV)
Deep inelastic scattering                   0.118 (6)           1.7
R in $\tau$ lepton decay                    0.123 (4)           1.8
$\psi$, $\Upsilon$ spectroscopy             0.110 (6)           2.3
Transverse momentum of W production         0.121 (24)          4.
Deep inelastic scattering (evolution)       0.112 (4)           5.
Event shapes in $e^+e^-$ annihilation       0.121 (6)           5.8, 9.1
Rate for $\psi$, $\Upsilon$ decay           0.108 (10)          9.5
R in $e^+e^-$ annihilation (20-65 GeV)      0.124 (21)          35.
R in $Z^0$ decay                            0.124 (7)           91.2

The values of $\alpha_s(m_Z)$ displayed in this table are obtained by fitting
experimental results to the theoretical expressions given by perturbative QCD using
minimal subtraction. The values of $\alpha_s$ have been evolved to $Q = m_Z$ using
the renormalization group equation. R refers to the ratio of cross sections or
partial widths to hadrons versus leptons. The numbers in parentheses are the
standard errors in the last displayed digits. The column labelled 'Q' gives an
idea of the value of Q at which the measurement was made. (Typically, these
measurements average over a range of Q, and that averaging is taken into
account in the quoted values of $\alpha_s$.) This table is based on the results compiled
by I. Hinchliffe in his article for the Particle Data Group, Phys. Rev. D50,
1297 (1994). This article contains a full set of references and a discussion of
the sources of uncertainty in these determinations.

The table shows the values of $\alpha_s$ extracted from each of these measurements,
expressed in terms of the value in the reference conventions, $\alpha_{s\,\overline{MS}}(m_Z)$.
We see that several of the experiments determine $\alpha_s$ to an accuracy of 5%, and
that the various determinations are consistent with one another at this level.
In Fig. 17.23, we have plotted the original values of $\alpha_s$ represented in Table
17.1, before conversion to a common scale, versus the momentum scale Q at
which each was obtained. This comparison gives a striking direct verification
of the running of $\alpha_s$.
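The evolution to a common scale and the consistency of the tabulated values can be illustrated numerically. The sketch below is not the procedure used to construct Table 17.1. It assumes the one-loop renormalization group equation with a fixed number of flavors $n_f = 5$ (no quark thresholds), and it combines the tabulated values with a naive inverse-variance weighted mean that ignores correlated theoretical errors. All function names are illustrative.

```python
import math

M_Z = 91.19  # GeV, the reference scale used in the text

# (alpha_s(m_Z), standard error) for the nine rows of Table 17.1
TABLE_17_1 = [
    (0.118, 0.006),  # deep inelastic scattering
    (0.123, 0.004),  # R in tau lepton decay
    (0.110, 0.006),  # psi, Upsilon spectroscopy
    (0.121, 0.024),  # transverse momentum of W production
    (0.112, 0.004),  # deep inelastic scattering (evolution)
    (0.121, 0.006),  # event shapes in e+e- annihilation
    (0.108, 0.010),  # rate for psi, Upsilon decay
    (0.124, 0.021),  # R in e+e- annihilation (20-65 GeV)
    (0.124, 0.007),  # R in Z0 decay
]


def b0(n_f):
    """First coefficient of the QCD beta function, b0 = 11 - (2/3) n_f."""
    return 11.0 - 2.0 * n_f / 3.0


def alpha_s_one_loop(q, alpha_mz=0.117, n_f=5):
    """One-loop running: 1/alpha(Q) = 1/alpha(m_Z) + (b0 / 2 pi) log(Q / m_Z).

    Assumption: n_f is held fixed over the whole range of Q.
    """
    inverse = 1.0 / alpha_mz + b0(n_f) / (2.0 * math.pi) * math.log(q / M_Z)
    return 1.0 / inverse


def weighted_mean(rows):
    """Naive inverse-variance weighted mean and its error."""
    weights = [1.0 / err ** 2 for _, err in rows]
    mean = sum(w * val for w, (val, _) in zip(weights, rows)) / sum(weights)
    return mean, 1.0 / math.sqrt(sum(weights))


if __name__ == "__main__":
    mean, err = weighted_mean(TABLE_17_1)
    print(f"naive weighted mean of alpha_s(m_Z): {mean:.4f} +/- {err:.4f}")
    for q in (1.7, 5.0, 35.0, M_Z):
        print(f"Q = {q:6.2f} GeV   alpha_s = {alpha_s_one_loop(q):.3f}")
```

Under these assumptions the weighted mean comes out close to the value 0.117 used as the initial condition in Fig. 17.23, and the one-loop coupling grows steadily as Q decreases.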

At the beginning of this chapter, we wrote down a candidate for the 
fundamental theory of strong interactions using only a few simple principles: 
the existence of quarks and the identification of their quantum numbers, and 
the idea that the theory of the quark interactions should be an asymptotically 
free gauge theory. It is remarkable that these simple considerations have led 
us to a description of strong interactions that is quantitatively correct for a 
broad range of phenomena in the hard-scattering regime where asymptotic 
freedom can be used as a tool for calculation.

Figure 17.23. Measurements of $\alpha_s$, plotted against the momentum scale Q
at which the measurement was made. This figure was constructed by evolving
the values of $\alpha_s(m_Z)$ listed in Table 17.1 back to the values of Q indicated
in the table. The value for $e^+e^-$ event shapes has been split into two points
corresponding to experiments at the TRISTAN and LEP accelerators. These
values are compared to the theoretical expectation from the renormalization
group evolution with the initial condition $\alpha_s(m_Z) = 0.117$.


Problems 


17.1 Two-loop renormalization group relations. 

(a) In higher orders of perturbation theory, the expression for the QCD $\beta$ function
will be a series

$$\beta(g) = -\frac{b_0}{(4\pi)^2}\,g^3 - \frac{b_1}{(4\pi)^4}\,g^5 - \frac{b_2}{(4\pi)^6}\,g^7 + \cdots.$$

Integrate the renormalization group equation and show that the running
coupling constant is now given by

$$\alpha_s(Q^2) = \frac{4\pi}{b_0}\left[\frac{1}{\log(Q^2/\Lambda^2)} - \frac{b_1}{b_0^2}\,\frac{\log\log(Q^2/\Lambda^2)}{(\log(Q^2/\Lambda^2))^2} + \cdots\right],$$

where the omitted terms decrease as $(\log(Q^2/\Lambda^2))^{-2}$.

(b) Combine this formula with the perturbation series for the $e^+e^-$ annihilation
cross section:

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_0 \cdot \Big(3\sum_f Q_f^2\Big) \cdot \left[1 + \frac{\alpha_s}{\pi} + a_2\Big(\frac{\alpha_s}{\pi}\Big)^2 + \mathcal{O}(\alpha_s^3)\right].$$

The coefficient $a_2$ depends on the details of the renormalization conditions
defining $\alpha_s$. Show that the leading two terms in the asymptotic behavior of $\sigma(s)$ for
large s depend only on $b_0$ and $b_1$ and are independent of $a_2$ and $b_2$. Thus the first
two coefficients of the QCD $\beta$ function are independent of the renormalization
prescription.

17.2 A direct test of the spin of the gluon. In this problem, we compare the 
predictions of QCD with those of a model in which the interaction of quarks is mediated 
by a scalar boson. Let the coupling of the scalar gluon to quarks be given by 


$$\Delta\mathcal{L} = g\,S\,\bar q q,$$

and define $\alpha_g = g^2/4\pi$.

(a) Using the technique described in parts (b) and (c) of the Final Project of Part I,
compute the cross section for $e^+e^- \to q\bar q S$ to the leading order of perturbation
theory. This cross section depends on the energies of the $q$, $\bar q$, and $S$, which we
represent as fractions $x_1$, $x_2$, $x_3$ of the electron beam energy, as in Eq. (17.18).
Show that

$$\frac{d^2\sigma}{dx_1\,dx_2}(e^+e^- \to q\bar q S) = \frac{4\pi\alpha^2}{3s}\cdot 3Q_q^2 \cdot \frac{\alpha_g}{4\pi}\,\frac{x_3^2}{(1-x_1)(1-x_2)}.$$

(b) In practice, it is very difficult to tell quarks from gluons experimentally, since
both particles appear as jets of hadrons. Therefore, let $x_a$ be the largest of
$x_1$, $x_2$, $x_3$, let $x_b$ be the second largest, and let $x_c$ be the smallest. Sum over
the various possibilities to derive an expression for $d^2\sigma/dx_a\,dx_b$, both in QCD,
using Eq. (17.18), and in the scalar gluon model. Show that these models can
be distinguished by their distributions in the $x_a$, $x_b$ plane.


17.3 Quark-gluon and gluon-gluon scattering. 

(a) Compute the differential cross section

$$\frac{d\sigma}{d\hat t}(q\bar q \to gg)$$

for quark-antiquark annihilation in QCD to the leading order in $\alpha_s$. This is most
easily done by computing the amplitudes between states of definite quark and
gluon helicity. Ignore all masses. Use explicit polarization vectors and spinors,
for example,

$$\epsilon^\mu = \frac{1}{\sqrt{2}}(0, 1, i, 0)$$

for a right-handed gluon moving in the +3 direction. You need only consider
transversely polarized gluons. By helicity conservation, only the initial states
$q_L\bar q_R$ and $q_R\bar q_L$ can contribute; by parity, these two states give identical cross
sections. Thus it is necessary only to compute the amplitudes for the three
processes

$$q_L\bar q_R \to g_R g_R, \qquad q_L\bar q_R \to g_R g_L, \qquad q_L\bar q_R \to g_L g_L.$$

In fact, by CP invariance, the first and third processes have equal cross sections.
After computing the amplitudes, square them and combine them properly with
color factors to construct the various helicity cross sections. Finally, combine
these to form the total cross section averaged over initial spins and colors.

(b) Compute the differential cross section

$$\frac{d\sigma}{d\hat t}(gg \to gg)$$

for gluon-gluon scattering. There are 16 possible combinations of helicities, but
many of them are related to each other by parity and crossing symmetry. All 16
can be built up from the three amplitudes for

$$g_R g_R \to g_R g_R, \qquad g_R g_R \to g_R g_L, \qquad g_R g_R \to g_L g_L.$$

Show that the last two of these amplitudes vanish. The first can be dramatically
simplified using the Jacobi identity. When the smoke clears, only three of the 16
polarized gluon scattering cross sections are nonzero. Combine these to compute
the spin- and color-averaged differential cross section.

17.4 The gluon splitting function. Compute the gluon splitting function (17.130) 
for the Altarelli-Parisi equations. To carry out this computation, first compute the 
matrix elements of the three-gluon vertex shown in Fig. 17.20 between gluon states of 
definite helicity. Combine these to derive the splitting function in the region x < 1.
Then fix the singularity of the splitting function at x = 1 to give this function the 
correct overall normalization. 

17.5 Photoproduction of heavy quarks. Consider the process of heavy quark 
pair photoproduction, $\gamma + p \to Q\bar Q + X$, for a heavy quark of mass M and electric
charge Q. If M is large enough, any diagram contributing to this process must involve 
a large momentum transfer; thus a perturbative QCD analysis should apply. This idea 
applies in practice already for the production of c quark pairs. Work out the cross 
section to the leading order in QCD. Choose the parton subprocess that gives the 
leading contribution to this reaction, and write the parton-model expression for the 
cross section. You will need to compute the relevant subprocess cross section, but this 
can be taken directly from one of the QED calculations in Chapter 5. Then use this 
result to write an expression for the cross section for $\gamma$-proton scattering.

17.6 Behavior of parton distribution functions at small x. It is possible to
solve the Altarelli-Parisi equations analytically for very small x, using some physically
motivated approximations. This discussion is based on a paper of Ralston.†

(a) Show that the $Q^2$ dependence of the right-hand side of the A-P equations can
be expressed by rewriting the equations as differential equations in

$$\xi = \log\log\frac{Q^2}{\Lambda^2},$$

where $\Lambda^2$ is the value of $Q^2$ at which $\alpha_s(Q^2)$, evolved with the leading-order $\beta$
function, formally goes to infinity.


†J. P. Ralston, Phys. Lett. 172B, 430 (1986).

(b) Since the branching functions to gluons are singular as $z^{-1}$ as $z \to 0$, it is
reasonable to guess that the gluon distribution function will blow up approximately
as $x^{-1}$ as $x \to 0$. The resulting distribution

$$dx\,f_g(x) \sim \frac{dx}{x}$$

is approximately scale invariant, and so its form should be roughly preserved by
the A-P equations. Let us, then, make the following two approximations: (1) the
terms involving the gluon distribution completely dominate the right-hand sides
of the A-P equations; and (2) the function

$$\tilde g(x, Q^2) = x f_g(x, Q^2)$$

is a slowly varying function of x. Using these approximations, and the limit
$x \to 0$, show that the A-P equation for $f_g(x)$ can be converted to the following
differential equation:

$$\frac{\partial^2}{\partial w\,\partial\xi}\,\tilde g(x,\xi) = \frac{12}{b_0}\,\tilde g(x,\xi),$$

where $w = \log(1/x)$ and $b_0 = (11 - \frac{2}{3}n_f)$. Show that if $w\xi \gg 1$, this equation
has the approximate solution

$$\tilde g = K(Q^2)\cdot\exp\left(\left[\frac{48}{b_0}\,w\,(\xi - \xi_0)\right]^{1/2}\right),$$

where $K(Q^2)$ is an initial condition.

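(c) The quark distribution at very small x is mainly created by branching of gluons.
Using the approximations of part (b), show that, for any flavor of quark, the
right-hand side of the A-P equation for $f_q(x)$ can be approximately integrated
to yield an equation for $\tilde q(x) = x f_q(x)$:

$$\frac{\partial}{\partial\xi}\,\tilde q(x,\xi) = \frac{2}{3 b_0}\,\tilde g(x,\xi).$$

Show, again using $w\xi \gg 1$, that this equation has as its integral

$$\tilde q = \sqrt{\frac{\xi - \xi_0}{27\,b_0\,w}}\;K(Q^2)\cdot\exp\left(\left[\frac{48}{b_0}\,w\,(\xi - \xi_0)\right]^{1/2}\right).$$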

(d) Ralston suggested that the initial condition

$$K(Q^2) = 50.36\,\big(\exp(\xi - \xi_0) - 0.957\big)\cdot\exp\big[-7.597\,(\xi - \xi_0)^{1/2}\big],$$

with $Q_0^2 = 5\ \text{GeV}^2$, $\Lambda = 0.2$ GeV, and $n_f = 5$, gave a reasonable fit to the
known properties of parton distributions, extrapolated into the small x region.
Use this function and the results above to sketch the behavior of the quark and
gluon distributions at small x and large $Q^2$.



Chapter 18 


Operator Products and Effective Vertices 


Our analysis of QCD in Chapter 17 was founded on the principle of asymptotic 
freedom, which told us that strong interaction processes with large momentum 
transfer might reliably be treated in weak-coupling perturbation theory. So 
far, however, we have made little use in QCD of the more powerful tools of the 
renormalization group. In this chapter, we will work out some implications of 
the Callan-Symanzik equation in QCD. We will see that asymptotically free 
theories have their own characteristic scaling behavior, with corrections in 
the form of anomalous powers of logarithms of the momentum scale. Though 
these corrections are generally weaker than those in the scalar field theories 
studied in Chapter 13, they nevertheless have important qualitative effects on 
the strong interactions. 

We begin by considering the scaling law for mass terms in QCD, taking 
over directly the formalism that we used to describe the mass term of $\phi^4$ theory
in Sections 12.4 and 12.5. Other applications, however, require a more power¬ 
ful theoretical tool, the operator product expansion. Section 18.3 introduces a 
general description of products of operators in quantum field theory and ex¬ 
plains how such operator products are constrained by the Callan-Symanzik 
equation. The last two sections use this tool to develop a new viewpoint to¬ 
ward deep inelastic scattering and other hard processes in QCD. 

18.1 Renormalization of the Quark Mass Parameter 

Up to this point, we have always assumed that quark masses are small enough 
that they can be ignored in high-energy processes. This is not always an 
adequate assumption even for the light quarks u, d, s ; for the heavier quarks c, 
b, t, the masses can have very important effects. However, since isolated quarks 
do not exist, it is not possible to define the mass of a quark unambiguously. In 
the discussion to follow, we will consider the quark mass to be a parameter of 
QCD perturbation theory, defined by a renormalization prescription at some 
renormalization scale M. 

Because we define the quark mass as we would a coupling constant, by 
a renormalization convention, we should expect that this parameter will run 
according to a renormalization group evolution, so that different values of the
mass parameter apply to different processes. We say that our original
prescription leads to an effective quark mass, which depends on the momentum
scale at which it is evaluated. In this section, we will work out the leading 
dependence of this effective mass on the momentum scale. 

The basic formalism for effective mass terms was set out in Section 12.5.
To add a mass term to the QCD Lagrangian, we must first define the mass
operator $(\bar q q)$ by a renormalization prescription at a scale M. Then we can
define the quark mass by adding to the Lagrangian the term

$$\Delta\mathcal{L}_m = -m\,(\bar q q)_M. \tag{18.1}$$

In this discussion, we will assume that the quark mass m is small enough
that we need only keep terms of leading order in m. We will also assume, for
simplicity, that we have such a mass term for only one quark flavor.

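In the zero-mass limit, Green's functions of the operator $(\bar q q)$ with quark
fields,

$$G^{(n;k)}(x_1,\ldots,x_n, y_1,\ldots,y_n, z_1,\ldots,z_k) = \big\langle q(x_1)\cdots q(x_n)\,\bar q(y_1)\cdots\bar q(y_n)\,\bar q q(z_1)\cdots\bar q q(z_k)\big\rangle, \tag{18.2}$$

obey the Callan-Symanzik equation

$$\left[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + 2n\gamma + k\gamma_{\bar q q}\right] G^{(n;k)}(\{x_i\};\{y_i\};\{z_j\}; g, M) = 0, \tag{18.3}$$

where $\gamma$ is the anomalous dimension of the quark field and $\gamma_{\bar q q}$ is the anomalous
dimension of the operator $\bar q q$. If we include the mass terms in the Lagrangian
according to (18.1), the Green's function of n quark fields and n antiquark
fields satisfies

$$\left[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + 2n\gamma + \gamma_{\bar q q}\,m\frac{\partial}{\partial m}\right] G^{(n)}(\{x_i\};\{y_i\}; g, m, M) = 0. \tag{18.4}$$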

The derivative with respect to m counts the number of times the mass operator
is used. In Section 12.5, we traded the variable m, with the dimensions of
mass, for a dimensionless variable. However, in QCD, it is just as convenient
to consider the dimensionful parameter m as a coupling constant. The solution
of the Callan-Symanzik equation will then contain a running mass parameter
$\bar m(Q)$, which depends on a typical momentum Q of the Green's function.
This parameter is defined as the solution to a renormalization group equation
analogous to Eq. (12.126). For this case, the equation is

$$\frac{d}{d\log(Q/M)}\,\bar m = \gamma_{\bar q q}(\bar g)\cdot\bar m, \tag{18.5}$$

with the initial condition

$$\bar m(M) = m. \tag{18.6}$$


The quantity $\bar m(Q)$ is the effective mass, which should be used to compute the
mass effects on quark production or scattering processes with the momentum
transfer Q.

To compute $\bar m(Q)$ explicitly, we need to work out the anomalous dimension
of the mass operator $\gamma_{\bar q q}$. This can be done as explained in Section 12.4.
We define the normalization of the operator explicitly by the prescription that
the vertex function of $(\bar q q)$ between renormalized quark fields should satisfy

[vertex diagram with one $\bar q q$ insertion] $= 1$ (18.7)

for $p^2 = q^2 = (p + q)^2 = -M^2$. To preserve (18.7), we will need a counterterm
vertex $\delta_{\bar q q}$ with the structure of the operator insertion. Then, as in Eq.
(12.112), the anomalous dimension is given to one-loop order by

$$\gamma_{\bar q q} = M\frac{\partial}{\partial M}\big(-\delta_{\bar q q} + \delta_2\big), \tag{18.8}$$

where $\delta_2$ is the counterterm for the quark field strength renormalization,
defined in Fig. 16.8. Correlation functions of the gauge-invariant operator $(\bar q q)$
are gauge invariant, and so the various terms in the Callan-Symanzik equation
for this function must sum to a gauge-invariant result. Since the leading
coefficient of $\beta(g)$ is independent of the gauge and of other conventions, it follows
from (18.3) that the leading coefficient of $\gamma_{\bar q q}$ is also convention independent.
The counterterms $\delta_2$ and $\delta_{\bar q q}$ both depend on the gauge. This argument shows
that the gauge dependence must cancel in (18.8). In the calculation to follow,
and in the other anomalous dimension calculations in this chapter, we will
work consistently in Feynman-'t Hooft gauge.

We have already computed the divergent part of the counterterm $\delta_2$ in
Feynman-'t Hooft gauge in Section 16.4. Evaluating the group-theory factor
in the result (16.77) for QCD, we find

$$\delta_2 = -\frac{4}{3}\,\frac{g^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}. \tag{18.9}$$

To compute $\delta_{\bar q q}$, we must work out the one-loop correction to the vertex
(18.7). This is given by the diagram

$$= \int\!\frac{d^4k}{(2\pi)^4}\,(ig)^2\,\gamma^\mu t^a\,\frac{i(\slashed{k}+\slashed{q})}{(k+q)^2}\cdot 1\cdot\frac{i\slashed{k}}{k^2}\,\gamma_\mu t^a\,\frac{-i}{(k-p)^2}. \tag{18.10}$$


In the expression for this diagram, the factor 1 represents the $\bar q q$ operator
insertion. In the corresponding diagram for the renormalization of the quark
number current $j^\nu = \bar q\gamma^\nu q$, this factor would be replaced by $\gamma^\nu$. Since we
need only the divergent part of the vertex renormalization (18.10), we can
approximate the integrand by its value for large k. Then this diagram becomes

$$\int\!\frac{d^4k}{(2\pi)^4}\,(ig)^2\,\gamma^\mu\,\frac{i\slashed{k}}{k^2}\,\frac{i\slashed{k}}{k^2}\,\gamma_\mu\,\frac{-i}{k^2}\,t^a t^a
= -i\,\frac{4}{3}\,g^2\int\!\frac{d^4k}{(2\pi)^4}\,\frac{d\cdot k^2}{(k^2)^3}
= \frac{4}{3}\,g^2\cdot 4\cdot\frac{\Gamma(2-\frac{d}{2})}{(4\pi)^2}. \tag{18.11}$$

To preserve the normalization condition (18.7), we must add the counterterm

$$\delta_{\bar q q} = -4\cdot\frac{4}{3}\,\frac{g^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}. \tag{18.12}$$

Assembling (18.8), (18.9), and (18.12), we find

$$\gamma_{\bar q q} = -8\,\frac{g^2}{(4\pi)^2}. \tag{18.13}$$

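The assembly of (18.8), (18.9), and (18.12) into (18.13) can be written out step by step. This is a sketch of the arithmetic, assuming $d = 4 - \epsilon$ (the symbol $\epsilon$ is introduced here only for illustration) and keeping only the pole of the gamma function, $\Gamma(\epsilon/2) \approx 2/\epsilon$:

```latex
\begin{align*}
-\delta_{\bar q q} + \delta_2
  &= \Big(4\cdot\frac{4}{3} - \frac{4}{3}\Big)\frac{g^2}{(4\pi)^2}\,
     \frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}
   = 4\,\frac{g^2}{(4\pi)^2}\,\Gamma\!\Big(\frac{\epsilon}{2}\Big)\,(M^2)^{-\epsilon/2}, \\
M\frac{\partial}{\partial M}\,(M^2)^{-\epsilon/2}
  &= -\epsilon\,(M^2)^{-\epsilon/2}, \\
\gamma_{\bar q q}
  &= 4\,\frac{g^2}{(4\pi)^2}\cdot\frac{2}{\epsilon}\cdot(-\epsilon) + \mathcal{O}(\epsilon)
   = -8\,\frac{g^2}{(4\pi)^2}.
\end{align*}
```

The factor 4 in the first line is the difference between the vertex counterterm, which carries the extra factor 4 from the Dirac trace structure in (18.11), and the field strength counterterm.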

As we have noted in the previous paragraph, the anomalous dimension
$\gamma_j$ of the quark number current can be found by a very similar calculation.
This will give a good check on our formalism, since, as we have argued above
Eq. (12.110), a conserved current is unambiguously normalized by its integral,
the conserved charge, and so must have zero anomalous dimension. If we
substitute $\gamma^\nu$ for 1 in (18.10) and use the same set of approximations to
reduce the integral, we find in the numerator the Dirac matrix structure

$$\gamma^\mu\slashed{k}\gamma^\nu\slashed{k}\gamma_\mu \to \frac{k^2}{d}\,\gamma^\mu\gamma^\lambda\gamma^\nu\gamma_\lambda\gamma_\mu = \frac{k^2}{d}\,(-2)^2\,\gamma^\nu. \tag{18.14}$$

Then, instead of (18.12), we need the counterterm

$$\delta_j = -\frac{4}{3}\,\frac{g^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}. \tag{18.15}$$

Combining this result with (18.9), we find

$$\gamma_j = 0, \tag{18.16}$$

in accord with our general arguments.
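The Dirac algebra that distinguishes the scalar insertion from the current insertion can be checked numerically in four dimensions. The sketch below assumes the Dirac representation of the gamma matrices and the metric (+, -, -, -); the helper names are illustrative. It verifies that $\gamma^\mu\gamma^\lambda\gamma_\lambda\gamma_\mu = 16$ while $\gamma^\mu\gamma^\lambda\gamma^\nu\gamma_\lambda\gamma_\mu = (-2)^2\gamma^\nu$, which is the origin of the relative factor 4 between (18.12) and (18.15).

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * x for x in row] for row in a]


def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def gamma_spatial(sigma):
    """gamma^i in the Dirac representation: off-diagonal blocks (sigma, -sigma)."""
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g


GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA = [GAMMA0] + [gamma_spatial(s) for s in PAULI]
METRIC = [1, -1, -1, -1]


def contract(x):
    """Return the sum over mu of gamma^mu x gamma_mu."""
    total = ZERO
    for mu in range(4):
        term = matmul(matmul(GAMMA[mu], x), GAMMA[mu])
        total = added(total, scaled(METRIC[mu], term))
    return total


if __name__ == "__main__":
    # Scalar insertion: gamma^mu gamma^lam 1 gamma_lam gamma_mu = 16
    print(close(contract(contract(IDENTITY)), scaled(16, IDENTITY)))
    # Current insertion: gamma^mu gamma^lam gamma^nu gamma_lam gamma_mu = 4 gamma^nu
    print(all(close(contract(contract(g)), scaled(4, g)) for g in GAMMA))
```

After symmetric integration each structure is multiplied by $k^2/d$, so the scalar vertex carries a factor $4k^2$ and the current vertex a factor $k^2\gamma^\nu$ at $d = 4$.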

If we replace the gamma function in (18.11) by an explicit factor of
$\log(\Lambda^2/Q^2)$, and then subtract the divergence using the counterterm (18.12),
we find that the vertex diagram behaves as

$$(\text{vertex diagram}) \sim \frac{4}{3}\,\frac{\alpha_s}{\pi}\,\log\frac{M^2}{Q^2}. \tag{18.17}$$

This diagram gives an enhancement at small external momenta. Some of this
enhancement is associated with the (gauge-dependent) rescaling of the
external quark fields; relation (18.8) tells us how to extract the piece of this
logarithm associated with the gauge-invariant enhancement of the effective
mass. Thus, to order $\alpha_s$,

$$\bar m(Q) = m\cdot\Big(1 + \frac{\alpha_s}{\pi}\,\log\frac{M^2}{Q^2}\Big). \tag{18.18}$$

Figure 18.1. Diagrams giving the leading logarithmic contributions to the
momentum dependence of the quark effective mass.


To compute the momentum dependence of the effective mass more
accurately, we must take two more features of the calculation into account.
First, the quantity $(\alpha_s\log(M^2/Q^2))$ may become of order 1, and, in this
case, we must take into account all leading logarithmic terms of the form
$(\alpha_s\log(M^2/Q^2))^n$. Contributions of this type come from all the diagrams
shown in Fig. 18.1. Second, the coupling constant $\alpha_s$ is itself a function of the
momentum scale, giving a further enhancement to contributions from small Q.
Both of these effects are properly accounted for by solving the renormalization
group equation (18.5). To the leading order in $g^2$, this equation takes the
explicit form

$$\frac{d}{d\log(Q/M)}\,\bar m = -\frac{2}{\pi}\,\alpha_s(Q^2)\cdot\bar m. \tag{18.19}$$


Inserting the solution of the renormalization group equation for $\bar g$ in the form
(17.17), we find

$$\frac{d}{d\log(Q/M)}\,\bar m = -\frac{8}{b_0\log(Q^2/\Lambda^2)}\,\bar m, \tag{18.20}$$

where $b_0$ is the first coefficient of the QCD $\beta$ function and $\Lambda$ is now the QCD
scale parameter defined in (17.16). The integral of this equation, satisfying
the initial condition (18.6), is

$$\bar m(Q^2) = \left(\frac{\log(M^2/\Lambda^2)}{\log(Q^2/\Lambda^2)}\right)^{4/b_0} m. \tag{18.21}$$

Recall that $b_0 = 11 - \frac{2}{3}n_f$ in QCD. Another way to express (18.21) is by
writing

$$\bar m(Q^2) = \left(\frac{\alpha_s(Q^2)}{\alpha_s(M^2)}\right)^{4/b_0} m. \tag{18.22}$$

Just as an illustration, take $n_f = 4$ and $\Lambda = 150$ MeV; then the effective
masses of the light quarks increase by about a factor 2 from Q = 100 GeV to
Q = 1 GeV.
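The size of this effect can be checked directly from (18.21) and (18.22). The sketch below assumes the leading-order running coupling $\alpha_s(Q) = 2\pi/(b_0\log(Q/\Lambda))$ with fixed $n_f$; the function names are illustrative.

```python
import math


def b0(n_f):
    """First coefficient of the QCD beta function, b0 = 11 - (2/3) n_f."""
    return 11.0 - 2.0 * n_f / 3.0


def alpha_s(q, lam, n_f):
    """Leading-order running coupling, alpha_s = 2 pi / (b0 log(Q / Lambda))."""
    return 2.0 * math.pi / (b0(n_f) * math.log(q / lam))


def mass_ratio(q, m_scale, lam, n_f):
    """Ratio m(Q) / m(M) in the form of (18.21)."""
    logs = math.log(m_scale ** 2 / lam ** 2) / math.log(q ** 2 / lam ** 2)
    return logs ** (4.0 / b0(n_f))


def mass_ratio_from_alpha(q, m_scale, lam, n_f):
    """The same ratio in the form of (18.22)."""
    ratio = alpha_s(q, lam, n_f) / alpha_s(m_scale, lam, n_f)
    return ratio ** (4.0 / b0(n_f))


if __name__ == "__main__":
    # All scales in GeV: Q = 1, M = 100, Lambda = 0.150, n_f = 4.
    print(mass_ratio(1.0, 100.0, 0.150, 4))
    print(mass_ratio_from_alpha(1.0, 100.0, 0.150, 4))
```

With these inputs the ratio is roughly 1.8, which is the "factor of about 2" quoted in the text, and the two forms agree as they must at leading order.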

The method we have just used for computing the QCD enhancement of
the quark mass operator applies equally well to the matrix elements of any 
other gauge-invariant operator. We conclude this section by recapitulating the 
conclusions of the argument in their more general form. 

Let $\mathcal{O}(x)$ be any gauge-invariant operator in QCD. As we saw for the mass
term, the one-loop corrections to the matrix elements of this operator may
contain enhancement or suppression terms proportional to $\alpha_s\log(M^2/Q^2)$,
where Q is the momentum scale of a QCD process mediated by $\mathcal{O}(x)$ and
M is the renormalization scale used to define the operator normalization.
The part of these one-loop corrections specifically associated with the
operator normalization is given by the anomalous dimension $\gamma_{\mathcal{O}}$. For an operator
containing n quark or antiquark fields and k gluon fields,

$$\gamma_{\mathcal{O}} = M\frac{\partial}{\partial M}\Big(-\delta_{\mathcal{O}} + \frac{n}{2}\,\delta_2 + \frac{k}{2}\,\delta_3\Big), \tag{18.23}$$

where $\delta_{\mathcal{O}}$ is the counterterm needed to preserve the operator normalization
condition and $\delta_2$ and $\delta_3$ are the counterterms for the quark and gluon field
strength renormalization defined in Fig. 16.8. From (18.23), we can derive the
explicit one-loop expression for $\gamma_{\mathcal{O}}$ in the form

$$\gamma_{\mathcal{O}} = -a_{\mathcal{O}}\,\frac{g^2}{(4\pi)^2}. \tag{18.24}$$

Using this result, we can solve the renormalization group equation for the
coefficient of $\mathcal{O}(x)$ and find the QCD renormalization factor

$$\left(\frac{\log(M^2/\Lambda^2)}{\log(Q^2/\Lambda^2)}\right)^{a_{\mathcal{O}}/2b_0}, \tag{18.25}$$

where $b_0$ is the first coefficient of the QCD $\beta$ function,

$$b_0 = 11 - \frac{2}{3}\,n_f. \tag{18.26}$$

The QCD renormalization (18.25) is an enhancement at small momenta if
$a_{\mathcal{O}} > 0$.

In the remainder of this chapter, we will present further examples of this 
enhancement or suppression by QCD logarithms. In many cases, we will see 
that these factors lead to striking and nontrivial physical effects.

18.2 QCD Renormalization of the Weak Interaction


Our next example of the appearance of QCD enhancement factors occurs in 
the theory of the weak interactions of hadrons. In Section 17.3, we introduced 
the weak interaction coupling of quarks and leptons, which we described by 
an effective Lagrangian. For our analysis here, we will need to know a few 
more details of the structure of the weak interactions, so we begin this section 
by presenting these facts. The complete structure of the weak interactions of 
quarks and leptons will be discussed systematically in Chapter 20. 

As we discussed in Section 17.3, the weak interactions among quarks and 
leptons are described by an effective Lagrangian resulting from the exchange 
of a virtual W vector boson. In (17.31), we wrote the effective vertex that 
couples quarks to leptons: 

$$\Delta\mathcal{L} = \frac{g^2}{2m_W^2}\;\bar\ell\gamma^\mu\Big(\frac{1-\gamma^5}{2}\Big)\nu\;\;\bar u\gamma_\mu\Big(\frac{1-\gamma^5}{2}\Big)d + \text{h.c.} \tag{18.27}$$

In this chapter, we will mainly be concerned with the effects of this interaction
for momentum scales much larger than 1 GeV. Thus, we will ignore quark
masses. All fermion fields that appear in the weak-interaction vertices are
multiplied by the left-handed projector $\frac{1}{2}(1 - \gamma^5)$. In the rest of this section,
we will not write this projector explicitly; rather, we will denote the projection
by a subscript L. We will also introduce the Fermi constant, given by (17.32).
Then (18.27) can be rewritten as

$$\Delta\mathcal{L} = \frac{4G_F}{\sqrt{2}}\,(\bar\ell_L\gamma^\mu\nu_L)(\bar u_L\gamma_\mu d_L) + \text{h.c.} \tag{18.28}$$


There is an analogous vertex that represents W exchange between pairs of
quarks; this has the form

$$\Delta\mathcal{L} = \frac{4G_F}{\sqrt{2}}\,(\bar d_L\gamma^\mu u_L)(\bar u_L\gamma_\mu d_L) + \text{h.c.} \tag{18.29}$$


However, for the discussion of this chapter, we will need to write a modified,
and less approximate, expression. When we discuss the theory of weak
interactions in detail in Chapter 20, we will learn that the charge +2/3 quarks
(u, c, t) couple to the charge -1/3 quarks (d, s, b) through the weak
interactions via a unitary rotation. Thus, for example, u couples to the combination

$$\cos\theta_c\,d + \sin\theta_c\,s, \tag{18.30}$$

plus a small admixture of b, which we will ignore in this section. The mixing
angle $\theta_c$ is called the Cabibbo angle. Because of this rotation, the weak
interaction effective Lagrangian coupling quarks to quarks actually contains a
number of terms, of which a particularly important one is

$$\Delta\mathcal{L} = \frac{4G_F}{\sqrt{2}}\,\cos\theta_c\sin\theta_c\,(\bar d_L\gamma^\mu u_L)(\bar u_L\gamma_\mu s_L). \tag{18.31}$$

This term allows the s quarks to decay through the process $s \to u\bar u d$. Similarly,
the rotation of (18.28) produces the effective interaction

$$\Delta\mathcal{L} = \frac{4G_F}{\sqrt{2}}\,\sin\theta_c\,(\bar\ell_L\gamma^\mu\nu_L)(\bar u_L\gamma_\mu s_L), \tag{18.32}$$

which leads to the decay $s \to u\bar\nu\ell$. These weak interaction processes are
referred to as nonleptonic and semileptonic decay processes, respectively.
Similar expressions apply to the other heavy quarks.

Given that (18.31) and (18.32) describe the weak interaction coupling of 
the s quark at a fundamental level, we now discuss the modification of these 
couplings by QCD logarithms. We have seen in the previous section that QCD 
corrections have a profound effect in enhancing the strength of the quark 
mass term of the underlying Lagrangian. We will now investigate whether the 
strength of the weak interactions can receive a similar enhancement. 

We first consider the semileptonic weak interaction operator (18.32). The
leptonic fermion bilinear is not affected by QCD, so the QCD enhancement
of this operator is just the same as that of its quark component

$$\bar u_L\gamma_\mu s_L. \tag{18.33}$$

However, this operator is a current and so has $\gamma = 0$. In terms of diagrams,
the logarithmic enhancement resulting from the diagram shown in Fig. 18.2
is canceled by the quark field-strength renormalization, as we saw already in
our discussion of the current vertex in Section 18.1. The left-handed projector
$\frac{1}{2}(1 - \gamma^5)$ commutes through the diagram and has no effect on the final result.
The same remark applies to the semileptonic weak interaction that links u
and d quarks. It implies, for that case, that the normalization of the cross
sections for deep inelastic neutrino scattering given in (17.35) is not affected
by QCD logarithms.

In the case of nonleptonic weak interactions, however, the effect of QCD is 
not so simple. Let us first compute the Feynman diagrams that give the leading 
corrections to the renormalization of the weak interaction vertex (18.31) and 
then, at a later stage, build up the renormalization group interpretation of 
these results. 

At order a s , the nonleptonic weak interaction vertex receives corrections 
from the diagrams shown in Fig. 18.3. Notice that the first diagram is pre¬ 
cisely the current renormalization found in the semileptonic case. The second 
diagram gives the analogous renormalization of the second quark current. In 
the computation of $\gamma$, these two contributions cancel the contributions from
the field-strength renormalization of the four quark fields. The remaining four 
diagrams of Fig. 18.3 are new contributions which contribute potentially large 
rescaling factors. 

We now compute these diagrams, beginning with the third diagram in Fig.
18.3. As in the computation of Section 18.1, we are interested in the
logarithmically divergent contribution associated with values of the loop momentum
k much larger than the external momenta. The simplest way to extract this
contribution is to compute each diagram in the approximation of zero external
momentum. In writing the expression for these diagrams, we will omit the
prefactor

$$\frac{4G_F}{\sqrt{2}}\,\cos\theta_c\sin\theta_c. \tag{18.34}$$

We will retain the quark fields to represent the external states, so that our
final expressions will have the form of rescaled operators.

Figure 18.2. QCD correction to the strength of the semileptonic weak
interaction vertex.

Figure 18.3. QCD corrections to the strength of the nonleptonic weak
interaction vertex.

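Using this notation, the third diagram in Fig. 18.3 has the value

$$= \int\!\frac{d^4k}{(2\pi)^4}\,(ig)^2\,\Big(\bar d_L\gamma^\nu t^a\,\frac{i\slashed{k}}{k^2}\,\gamma^\mu u_L\Big)\Big(\bar u_L\gamma_\nu t^a\,\frac{-i\slashed{k}}{k^2}\,\gamma_\mu s_L\Big)\frac{-i}{k^2}. \tag{18.35}$$

Using the symmetry of the k integral, we extract the divergent piece:

$$= ig^2\int\!\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2)^2}\cdot\frac{1}{4}\,\big(\bar d_L\gamma^\nu t^a\gamma^\lambda\gamma^\mu u_L\big)\big(\bar u_L\gamma_\nu t^a\gamma_\lambda\gamma_\mu s_L\big). \tag{18.36}$$

To put the product of quark fields into a more familiar form, we apply the
Fierz transformation discussed at the end of Section 3.4. If the color matrices
$t^a$ were not present, the product of fermion fields would be exactly the one
appearing in (3.82), and we would find

$$\big(\bar d_L\gamma^\nu\gamma^\lambda\gamma^\mu u_L\big)\big(\bar u_L\gamma_\nu\gamma_\lambda\gamma_\mu s_L\big) = 16\,\big(\bar d_L\gamma^\mu u_L\big)\big(\bar u_L\gamma_\mu s_L\big). \tag{18.37}$$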


The matrices $t^a$ redirect the color quantum numbers of the quark fields.
To clarify this, we need the analogue of identity (3.77) for color. To find this
identity, consider the color invariant

$$(t^a)_{ij}(t^a)_{k\ell}. \tag{18.38}$$

The indices $i$, $k$ transform according to the $3$ representation of color; the
indices $j$, $\ell$ transform according to the $\bar 3$. Thus, (18.38) must be a linear
combination of the two possible ways to contract these indices,

$$A\,\delta_{i\ell}\delta_{kj} + B\,\delta_{ij}\delta_{k\ell}. \tag{18.39}$$

The constants A and B can be determined by contracting (18.38) and (18.39)
with $\delta_{ij}$ and with $\delta_{jk}$ and adjusting A and B so that the contractions of
(18.39) obey the identities

$$\mathrm{tr}[t^a]\,(t^a)_{k\ell} = 0; \qquad (t^a t^a)_{i\ell} = \frac{4}{3}\,\delta_{i\ell}. \tag{18.40}$$

This gives the identity

$$(t^a)_{ij}(t^a)_{k\ell} = \frac{1}{2}\Big(\delta_{i\ell}\delta_{kj} - \frac{1}{3}\,\delta_{ij}\delta_{k\ell}\Big). \tag{18.41}$$

A similar relation holds for the generators of SU(N) in the fundamental
representation, with (1/3) replaced by (1/N) in that case.
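Identity (18.41) and the Casimir relation in (18.40) can be verified by brute force. The sketch below assumes the standard Gell-Mann matrices with $t^a = \lambda^a/2$, normalized so that $\mathrm{tr}[t^a t^b] = \frac{1}{2}\delta^{ab}$; the function names are illustrative.

```python
import math


def gell_mann():
    """The eight Gell-Mann matrices as nested lists."""
    s3 = 1.0 / math.sqrt(3.0)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
    ]


# SU(3) generators in the fundamental representation, t^a = lambda^a / 2
T = [[[0.5 * x for x in row] for row in lam] for lam in gell_mann()]


def delta(a, b):
    return 1.0 if a == b else 0.0


def lhs(i, j, k, l):
    """Sum over a of (t^a)_ij (t^a)_kl, the left-hand side of (18.41)."""
    return sum(t[i][j] * t[k][l] for t in T)


def rhs(i, j, k, l):
    """Right-hand side of (18.41)."""
    return 0.5 * (delta(i, l) * delta(k, j) - delta(i, j) * delta(k, l) / 3.0)


def max_deviation():
    """Largest difference between the two sides over all index values."""
    r = range(3)
    return max(abs(lhs(i, j, k, l) - rhs(i, j, k, l))
               for i in r for j in r for k in r for l in r)


def casimir(i, l):
    """(t^a t^a)_il, which should equal (4/3) delta_il as in (18.40)."""
    return sum(t[i][j] * t[j][l] for t in T for j in range(3))


if __name__ == "__main__":
    print(max_deviation())
    print(casimir(0, 0), casimir(0, 1))
```

The maximum deviation is zero up to floating-point rounding, and the diagonal entries of $t^a t^a$ equal 4/3.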

Inserting (18.41) into (18.36), we find that the first term of the identity
generates a new four-fermion operator,

$$\big(\bar d_{Li}\gamma^\nu\gamma^\lambda\gamma^\mu u_{Lj}\big)\big(\bar u_{Lj}\gamma_\nu\gamma_\lambda\gamma_\mu s_{Li}\big), \tag{18.42}$$

where $i$, $j$ are color indices. Applying the Fierz rearrangement in (18.37),
and then applying the additional rearrangement (3.79), we can convert this
operator to the form

$$16\,\big(\bar d_{Li}\gamma^\mu u_{Lj}\big)\big(\bar u_{Lj}\gamma_\mu s_{Li}\big) = 16\,\big(\bar u_{Lj}\gamma^\mu u_{Lj}\big)\big(\bar d_{Li}\gamma_\mu s_{Li}\big). \tag{18.43}$$

The minus sign in (3.79) is compensated by a minus sign from interchanging
the order of fermion fields. The final result is a product of color-singlet quark
currents; however, the fields in these currents are associated differently from
the original operator.

The final result of our evaluation of this diagram is

$$= -4\,\frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac{d}{2})\cdot\frac{1}{2}\Big[\bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L - \frac{1}{3}\,\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\Big]. \tag{18.44}$$

The fourth diagram of Fig. 18.3 gives precisely the same contribution.

The evaluation of the last two diagrams in Fig. 18.3 is quite similar. The 
fifth diagram gives 

= / 4 ^ ii9)2 ^^ LYta ^ U ^ 

= ~'ig 2 J -^^J^{dLYt a l X Y'uL){u L ^,,t a 'y X 'y v SL) 

= + x r ( 4 7 r) 2 ) ( d L i't a -i > '’fu L ) ( u L "f t :t a j\'y„s L ). 


(18.45) 
The four-fermion operator can be simplified as follows, by the use of the Fierz
identity (3.79):

{d L 7 l ' , wfuL) (ul1h1\1isS L ) = ) ( Ti //s„ ii i.) 

= +(-2) 2 (d L j lx s L ) (u L %u L ) (18.46) 

= 4:(d L j^u L )(u L J^s L ). 

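Again, we must reduce the product of color matrices using identity (18.41),
and, again, the first term of this identity will require an additional Fierz
transformation. The final result is

$$ \text{(diagram)} = +\frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac{d}{2})\cdot\frac{1}{2}\Bigl[\bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L - \frac{1}{3}\,\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\Bigr] . \tag{18.47} $$

The last diagram in Fig. 18.3 gives an identical contribution. The sum of the
contributions from these four diagrams is

$$ -3\,\frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac{d}{2})\,\Bigl[\bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L - \frac{1}{3}\,\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\Bigr] . \tag{18.48} $$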


The extraction of the ultraviolet-divergent pieces of the diagrams of Fig.
18.3 is part of our formal prescription for computing the Callan-Symanzik $\gamma$
function of the weak interaction vertex. However, it is useful to pause at this
point and ask about the physical significance of this divergence. The diagrams
of Fig. 18.3 would not be divergent if we computed them in the underlying
theory with W bosons. In writing the weak interaction as an effective local
vertex, we approximated the W boson propagator by a constant, assuming
that the momentum $k$ that it carried was much less than $m_W$:

$$ \frac{1}{k^2 - m_W^2} \to \frac{-1}{m_W^2} . \tag{18.49} $$

The approximation we used to compute the QCD corrections to the effective
vertex is valid only in the region of integration where $k^2 \ll m_W^2$. Outside this
region we must use the full W propagator; this introduces an extra factor
of $k^2$ in the denominator and makes the integral converge. Thus, in a direct
calculation of the QCD correction, the ultraviolet-divergent terms in the
evaluation of Fig. 18.3 would be replaced by logarithms cut off at $m_W$. The lower
limit of the logarithm is set by the external momenta. In the decay of a K
meson (the lightest hadron containing the s quark) these are of order $m_K$.
Thus the correction given in (18.48) should be evaluated by replacing

$$ \frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac{d}{2}) \to \frac{\alpha_s}{4\pi}\,\log\frac{m_W^2}{m_K^2} . \tag{18.50} $$


With this interpretation, we can rewrite (18.48) as the order-$\alpha_s$ correction
to the leading-order weak interaction vertex. The effect of this correction is
the rescaling and modification of the weak interaction operator:

$$ \bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L \to \Bigl(1 + \frac{\alpha_s}{4\pi}\log\frac{m_W^2}{m_K^2}\Bigr)\,\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L - 3\,\frac{\alpha_s}{4\pi}\log\frac{m_W^2}{m_K^2}\;\bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L . \tag{18.51} $$

Notice that the QCD corrections not only rescale the normalization of the
original operator but also introduce a new operator with a different
structure. This calculation makes concrete the idea introduced in Section 12.4 that
the diagrams that change the normalization of local operators may also mix
together different operators with the same dimension and quantum numbers.

Since the value of the logarithm in (18.50) is about 10, the size of the
leading QCD correction is of order 1 and so higher-order corrections are important.
To sum the leading logarithmic corrections, we return to the renormalization
group analysis. For clarity, define

$$ O^1 = \bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L; \qquad O^2 = \bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L . \tag{18.52} $$

We will use the subscript 0 to denote bare operators and the subscript $M$ to
denote operators obeying renormalization conditions at the scale $M$. From the
diagrams of Fig. 18.3, we have found that the operator whose matrix elements
have the quark structure of $O^1$, properly normalized at the scale $M$, is given
by

$$ O^1_M = O^1_0 + \delta^{11}\,O^1_0 + \delta^{12}\,O^2_0 , \tag{18.53} $$

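where the $\delta^{ij}$ are counterterms,

$$ \delta^{11} = -\frac{g^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}, \qquad \delta^{12} = +3\,\frac{g^2}{(4\pi)^2}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}} . \tag{18.54} $$

A reciprocal calculation gives $O^2_M$ in terms of bare operators:

$$ O^2_M = O^2_0 + \delta^{21}\,O^1_0 + \delta^{22}\,O^2_0 , \tag{18.55} $$

with

$$ \delta^{21} = \delta^{12}, \qquad \delta^{22} = \delta^{11} . $$

Then, in the manner that we discussed in Eq. (12.109), the operator rescaling
of $O^1$ and $O^2$ is described in the Callan-Symanzik equation by a matrix $\gamma^{ij}$
linking the two operators. Expanding this equation to first order in $g^2$, we see
that this matrix is given to one-loop order by

$$ \gamma^{ij} = M\frac{\partial}{\partial M}\bigl(-\delta^{ij}\bigr) . \tag{18.56} $$

Thus we find

$$ \gamma = \frac{g^2}{(4\pi)^2}\begin{pmatrix} -2 & 6 \\ 6 & -2 \end{pmatrix} , \tag{18.57} $$

acting on the space of operators $O^1$, $O^2$.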
The simplest way to deduce the physical effects of the rescaling described
by (18.57) is to diagonalize this matrix and thus find a new basis of operators
that are rescaled without mixing. For the matrix (18.57), the eigenoperators
are easily seen to be

$$ O^{1/2} = \tfrac{1}{2}\bigl[\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L - \bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L\bigr], \qquad O^{3/2} = \tfrac{1}{2}\bigl[\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L + \bar u_L\gamma^\mu u_L\,\bar d_L\gamma_\mu s_L\bigr] . \tag{18.58} $$

The superscripts on these operators are their isospin quantum numbers. The
operator $O^{1/2}$ is antisymmetric under the interchange of the labels $\bar d$ and $\bar u$;
thus, these two isospin-1/2 fields are combined to total isospin zero, and so
the whole operator is isospin-1/2. This operator can mediate decays of the K
meson that change the isospin by 1/2 unit, such as $K^0 \to \pi^+\pi^-$, but not
processes that change the isospin by 3/2, such as $K^+ \to \pi^+\pi^0$. Experimentally,
processes of the former type occur almost a thousand times faster (an
observation called the $\Delta I = 1/2$ rule). Thus, it is interesting that the hard QCD
corrections already make a distinction between these operators.

From the eigenvalues of (18.57), we obtain the Callan-Symanzik $\gamma$
functions of the eigenoperators (18.58):

$$ \gamma_{1/2} = -8\,\frac{g^2}{(4\pi)^2}; \qquad \gamma_{3/2} = +4\,\frac{g^2}{(4\pi)^2} . \tag{18.59} $$
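The diagonalization behind (18.58) and (18.59) is small enough to do by hand, but a short sketch makes the bookkeeping explicit. The helper below is a hypothetical illustration (not part of the text's formalism): it returns the eigenvalues and unnormalized eigenvectors of a real symmetric 2x2 matrix, applied here to (18.57) in units of $g^2/(4\pi)^2$.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], assuming b != 0.

    Returns [(lambda_minus, v_minus), (lambda_plus, v_plus)] with unnormalized
    eigenvectors v = (b, lambda - a), which solve (a - lambda) v1 + b v2 = 0.
    """
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return [(lam, (b, lam - a)) for lam in (mean - radius, mean + radius)]

# Anomalous dimension matrix (18.57), with the factor g^2/(4 pi)^2 stripped off.
for lam, vec in eig_sym_2x2(-2.0, 6.0, -2.0):
    print(lam, vec)
```

The eigenvalue $-8$ comes with an eigenvector proportional to $(1, -1)$, the combination $O^1 - O^2$, and the eigenvalue $+4$ with $(1, 1)$, the combination $O^1 + O^2$.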

According to Eqs. (18.24) and (18.25), this implies that the operator $O^{1/2}$
receives an enhancement from hard QCD logarithms, while the operator $O^{3/2}$
receives a suppression. More explicitly, we can write the operator that appears
in the original nonleptonic weak interaction vertex (18.31) as

$$ \bigl[\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\bigr]_{m_W} = \bigl[O^{1/2}\bigr]_{m_W} + \bigl[O^{3/2}\bigr]_{m_W} . \tag{18.60} $$

As above, the subscript refers to the mass scale at which the operator is
normalized. We now account for the QCD logarithms associated with
evaluating the matrix element of this operator at a lower momentum scale, $m_K$,
by replacing the operators on the right-hand side of (18.60) with operators
renormalized at $m_K$, with the rescaling factor (18.25). This gives

$$ \bigl[\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\bigr]_{m_W} = \Bigl(\frac{\log(m_W^2/\Lambda^2)}{\log(m_K^2/\Lambda^2)}\Bigr)^{4/b_0}\bigl[O^{1/2}\bigr]_{m_K} + \Bigl(\frac{\log(m_W^2/\Lambda^2)}{\log(m_K^2/\Lambda^2)}\Bigr)^{-2/b_0}\bigl[O^{3/2}\bigr]_{m_K} , \tag{18.61} $$

where, again, $b_0 = 11 - \frac{2}{3} n_f$. This equation shows that, unlike the case of
semileptonic weak interactions, the overall normalization of the effective
Lagrangian for nonleptonic weak interactions is changed by QCD logarithms. In
addition, the quark structure of the effective Lagrangian is altered.
Quantitatively, taking $n_f = 4$ and $\Lambda = 150$ MeV as an illustration, we
find

$$ \bigl[\bar d_L\gamma^\mu u_L\,\bar u_L\gamma_\mu s_L\bigr]_{m_W} = 2.1\,\bigl[O^{1/2}\bigr]_{m_K} + 0.7\,\bigl[O^{3/2}\bigr]_{m_K} . \tag{18.62} $$

Thus, the QCD logarithmic corrections from $m_W$ to $m_K$ give the $\Delta I = 1/2$
part of the effective vertex an enhancement of about a factor of 3.* The
observed $\Delta I = 1/2$ rule in K decays requires a factor of 20 enhancement.
However, part of this is expected to arise from the ratio of the matrix elements
of the operators $[O^{1/2}]_{m_K}$ and $[O^{3/2}]_{m_K}$ between physical hadron states, which are
determined by the soft, nonperturbative part of QCD dynamics.
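The size of the two rescaling factors in (18.61) is easy to estimate numerically. The sketch below uses $n_f = 4$ and $\Lambda = 150$ MeV as in the text; the mass values $m_W \approx 80.4$ GeV and $m_K \approx 0.494$ GeV are assumptions supplied here, so the output should only be expected to land near the rounded coefficients of (18.62).

```python
import math

def rescaling_factors(m_high, m_low, lam_qcd, n_f):
    """Leading-log factors multiplying O^{1/2} and O^{3/2} in (18.61).

    All masses must be given in the same units.
    """
    b0 = 11.0 - 2.0 * n_f / 3.0
    ratio = math.log(m_high**2 / lam_qcd**2) / math.log(m_low**2 / lam_qcd**2)
    return ratio ** (4.0 / b0), ratio ** (-2.0 / b0)

M_W = 80.4    # GeV, assumed value
M_K = 0.494   # GeV, assumed value
LAMBDA = 0.150  # GeV, as in the text

half, three_half = rescaling_factors(M_W, M_K, LAMBDA, n_f=4)
print("Delta I = 1/2 factor:", half)
print("Delta I = 3/2 factor:", three_half)
print("relative enhancement:", half / three_half)
print("log(m_W^2/m_K^2):", math.log(M_W**2 / M_K**2))
```

The last line evaluates the logarithm of (18.50), which the text quotes as about 10.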


18.3 The Operator Product Expansion 

One way to describe the development of the previous section is to say that 
we studied an interaction that was fundamentally a product of currents by 
replacing this product of operators with a single local operator. We then 
derived the physical consequences of the original, composite, interaction by 
working out the QCD rescaling of this operator. The procedure of replacing a 
product of operators with a single effective vertex is useful in many contexts 
in quantum field theory. Thus, in this section, we will pause from our study 
of QCD to write out the general formalism governing this procedure. 

Let us abstract the situation described in the previous section as follows:
Consider a quantum field theory process that includes two operators $O_1$, $O_2$
separated by a small distance $x$, together with other fields $\phi(y_i)$ located much
farther away, or together with external physical states. In the example above,
the two operators are the quark currents that appear in the weak interaction
vertex, and their separation $x$ is a distance of order $m_W^{-1}$, the range of the
W propagator. The external states, which contain $K$ and $\pi$ mesons, can be
described by operators that create and destroy these particles. The amplitude
for $K$ decay by the weak interactions, or any more general process of this
class, can then be extracted from the Green's function

$$ G_{12}(x; y_1, \ldots, y_m) = \langle O_1(x)\,O_2(0)\,\phi(y_1)\cdots\phi(y_m)\rangle , \tag{18.63} $$

considered in the limit $x \to 0$, with the $y_i$ fixed away from the origin. Here
and in the following discussion, products of operators will be considered to be
time-ordered, just as we would find by writing the product of fields under the
functional integral.

The product of operators $O_1(x)\,O_2(0)$ can potentially create the most
general local disturbance in the vicinity of the point 0. However, any such
disturbance can be described as the effect of a local operator placed at 0. This


*M. K. Gaillard and B. W. Lee, Phys. Rev. Lett. 33, 108 (1974); G. Altarelli and 
L. Maiani, Phys. Lett. 52B, 351 (1974). 
local operator must have the global symmetry quantum numbers of the
product $O_1 O_2$, but it is otherwise unrestricted. It is useful to write this operator
as a linear combination of operators from a standard basis. The coefficients in
this linear combination can depend on the separation $x$. Typically, products
of operators in quantum field theory are singular, so it is likely that some of
the coefficients will have singularities as $x \to 0$. Combining these observations,
Wilson proposed that the effects of the operator product could be computed
by replacing the product of operators in (18.63) with a linear combination of
local operators,

$$ O_1(x)\,O_2(0) \to \sum_n C_{12}{}^{n}(x)\,O_n(0) , \tag{18.64} $$

where the coefficients $C_{12}{}^{n}(x)$ are c-number functions. This operator product
expansion (OPE) will depend only on the operators $O_1$, $O_2$, and their
separation and will be independent of the identity and location of the other fields
appearing in the Green's function.

The expansion (18.64) implies that the Green's function (18.63) can be
expanded for small $x$ as follows:

$$ G_{12}(x; y_1, \ldots, y_m) = \sum_n C_{12}{}^{n}(x)\,G_n(y_1, \ldots, y_m) , \tag{18.65} $$

where

$$ G_n(y_1, \ldots, y_m) = \langle O_n(0)\,\phi(y_1)\cdots\phi(y_m)\rangle , \tag{18.66} $$

and all of the dependence on $x$ is now carried by the OPE coefficient functions.
In the example of the previous section, the final amplitudes depended in a
rather involved way on the small separation of the two operators, through
the dependence of the coefficients in (18.61) on $m_W$. From the viewpoint of
the operator product expansion, this dependence is carried by the coefficient
functions and is determined for all matrix elements when these are computed.

In Sections 18.1 and 18.2, we used the renormalization group to compute
the enhancement or suppression factors for operator matrix elements. Thus it
is natural to expect that the form of the operator product coefficients is also
determined by the renormalization group. We will now work out this relation.
To begin, we rewrite the expansion (18.64) more precisely. The operators that
appear in this relation must be defined at some renormalization scale $M$. Then
the operator product expansion reads:

$$ [O_1(x)]_M\,[O_2(0)]_M = \sum_n C_{12}{}^{n}(x; M)\,[O_n(0)]_M . \tag{18.67} $$

Note that the coefficient functions can depend on $M$, since they must absorb
the $M$-dependent operator rescalings. If we use the left-hand side of (18.67)
to compute (18.63), this function obeys the Callan-Symanzik equation

$$ \Bigl[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + m\gamma + \gamma_1 + \gamma_2\Bigr]\,G_{12}(x; y_1, \ldots, y_m; M) = 0 . \tag{18.68} $$
Similarly, with the operator $O_n$ normalized at $M$, the Green's function (18.66)
obeys

$$ \Bigl[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + m\gamma + \gamma_n\Bigr]\,G_n(y_1, \ldots, y_m; M) = 0 . \tag{18.69} $$

By applying (18.68) to the right-hand side of (18.65), we see that these
relations are consistent only if the OPE coefficient functions obey the
Callan-Symanzik equation

$$ \Bigl[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + \gamma_1 + \gamma_2 - \gamma_n\Bigr]\,C_{12}{}^{n}(x; M) = 0 . \tag{18.70} $$

We now solve this equation by our standard methods. First, let us apply
dimensional analysis. If the operators $O_1$, $O_2$, $O_n$ have dimensions $d_1$, $d_2$, $d_n$,
the coefficient function $C_{12}{}^{n}(x)$ must have the dimensions of (mass)$^{d_1+d_2-d_n}$.
Thus,

$$ C_{12}{}^{n}(x; M) = \Bigl(\frac{1}{x}\Bigr)^{d_1+d_2-d_n}\,\tilde C(xM) , \tag{18.71} $$

where $\tilde C(xM)$ is a dimensionless function. This function is determined from
(18.70) according to the method of Section 12.3. Thus,


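$$ C_{12}{}^{n}(x; M) = \Bigl(\frac{1}{x}\Bigr)^{d_1+d_2-d_n}\,c(\bar g(1/x))\,\exp\Bigl[\int_{1/x}^{M} d\log M'\,\bigl(\gamma_n - \gamma_1 - \gamma_2\bigr)\Bigr] , \tag{18.72} $$

with $c(\bar g)$ a dimensionless function of the running coupling constant at the
separation scale $1/x$.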

At a fixed point of the renormalization group, the $\gamma$ functions would take
definite values $\gamma_{j*} = \gamma_j(g_*)$. Then, the solution (18.72) can be evaluated as

$$ C_{12}{}^{n}(x; M) = \Bigl(\frac{1}{x}\Bigr)^{d_1+d_2-d_n}\,c(g_*)\,\exp\bigl[\log(xM)\,(\gamma_{n*} - \gamma_{1*} - \gamma_{2*})\bigr] . \tag{18.73} $$

Thus, in this case,

$$ C_{12}{}^{n}(x) \sim \Bigl(\frac{1}{x}\Bigr)^{d_{1*}+d_{2*}-d_{n*}} , \tag{18.74} $$

where

$$ d_{j*} = d_j + \gamma_j(g_*) \tag{18.75} $$

is the true scaling dimension of the operator $O_j$ at the fixed point.



For the case of an asymptotically free theory, the scaling relation is
complicated in the way that we worked out in Section 18.1. In the leading order
of perturbation theory, the three $\gamma$ functions take the form (18.24). Then the
solution of (18.72) takes the form

$$ C_{12}{}^{n}(x; M) = \Bigl(\frac{1}{x}\Bigr)^{d_1+d_2-d_n}\Bigl(\frac{\log(1/|x|^2\Lambda^2)}{\log(M^2/\Lambda^2)}\Bigr)^{(a_n-a_1-a_2)/2b_0} . \tag{18.76} $$
In the example of Section 18.2, the original operators were currents with
dimension 3 and $\gamma = 0$, at separation $m_W^{-1}$, and the final local operators had
dimension 6. Thus, (18.76) does properly reproduce the dependence of (18.61)
on $m_W$. Notice that the renormalization group dependence is less complicated
for a product of currents, which have a fixed normalization independent of
scale. This special case occurs often in applications of the operator product
expansion.
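The step from the exponential in (18.72) to the power of a ratio of logarithms in (18.76) can be checked numerically. The sketch below assumes the leading-order forms $\gamma_j = -a_j\,g^2/(4\pi)^2$ and $\bar g^2(M')/(4\pi)^2 = 1/(b_0 \log(M'^2/\Lambda^2))$, which is how (18.24) and the one-loop running coupling are being read here; the function names and the numerical inputs are illustrative assumptions.

```python
import math

def log_factor_closed(inv_x, M, lam, delta_a, b0):
    """Logarithmic factor of (18.76), with delta_a = a_n - a_1 - a_2."""
    return (math.log(inv_x**2 / lam**2) / math.log(M**2 / lam**2)) ** (delta_a / (2 * b0))

def log_factor_numeric(inv_x, M, lam, delta_a, b0, steps=20000):
    """Exponential of (18.72), integrated by the midpoint rule in t = log M'."""
    t0, t1 = math.log(inv_x), math.log(M)
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        g2_over_16pi2 = 1.0 / (2.0 * b0 * (t - math.log(lam)))  # one-loop coupling
        total += (-delta_a * g2_over_16pi2) * h  # gamma_n - gamma_1 - gamma_2
    return math.exp(total)

b0 = 11.0 - 2.0 * 4 / 3.0
# Currents (a_1 = a_2 = 0) at separation 1/m_W, operator O^{1/2} with a_n = 8,
# renormalized at m_K; masses in GeV are assumed values.
closed = log_factor_closed(80.4, 0.494, 0.150, 8.0, b0)
numeric = log_factor_numeric(80.4, 0.494, 0.150, 8.0, b0)
print(closed, numeric)
```

With $a_n = 8$ the exponent is $4/b_0$, which is the power appearing on the $O^{1/2}$ term of (18.61).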

We have written Eq. (18.70) without taking account of operator mixing.
However, as we have already seen, operator mixing is often an essential part
of the applications of the OPE. It is straightforward to include this effect by
rewriting the analysis that leads to (18.70) using matrix-valued $\gamma$ functions.
For example, with operator mixing, the Callan-Symanzik equation for $G_n$ will
be modified to

$$ \Bigl[\delta_{np}\Bigl(M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g} + m\gamma\Bigr) + \gamma_{np}\Bigr]\,G_p(y_1, \ldots, y_m; M) = 0 . \tag{18.77} $$

With these changes, (18.70) becomes

$$ \Bigl[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g}\Bigr]\,C_{12}{}^{n}(x; M) + \gamma_{1j}\,C_{j2}{}^{n}(x; M) + \gamma_{2k}\,C_{1k}{}^{n}(x; M) - C_{12}{}^{k}(x; M)\,\gamma_{kn} = 0 . \tag{18.78} $$

Notice that the first two $\gamma$ matrices act on the OPE coefficient from the left,
while the third acts from the right. In the case of a product of currents, the
first two $\gamma$ matrices vanish and (18.78) simplifies to

$$ \Bigl[M\frac{\partial}{\partial M} + \beta\frac{\partial}{\partial g}\Bigr]\,C_{JJ}{}^{n}(x; M) - C_{JJ}{}^{k}(x; M)\,\gamma_{kn} = 0 . \tag{18.79} $$

This equation will play an important role in the analysis of Section 18.5.

18.4 Operator Analysis of $e^+e^-$ Annihilation

It is not difficult to imagine that there is a connection between matrix elements 
in which currents are placed at short distances from one another and matrix 
elements in which currents deliver a hard momentum transfer. Thus we might 
expect that the idea of the operator product expansion will give us a new 
viewpoint from which to understand the theory of hard-scattering processes 
in QCD. In this section and the next, we will work out the relation of the 
operator product expansion to the perturbative QCD analysis of Chapter 17. 

We begin by discussing the total cross section for $e^+e^-$ annihilation to
hadrons. Below Eq. (17.9), we argued that this total cross section could be
computed in QCD perturbation theory, using a value of $\alpha_s$ corresponding to
the scale of the total center of mass energy. However, this argument was a
purely intuitive one, with many logical jumps. In this section, we will give a
more rigorous argument to the same conclusion.
Figure 18.4. Diagrams whose imaginary part yields the total cross section
for $e^+e^- \to$ hadrons.


In order to invoke the operator product expansion, we must write the
total cross section for $e^+e^-$ annihilation to hadrons as the matrix element of
a product of currents. To do this, we use the optical theorem to relate the
total $e^+e^-$ scattering cross section to the forward scattering amplitude for
$e^+e^- \to e^+e^-$. Ignoring the mass of the electron, we see from Eq. (7.49) that

$$ \sigma(e^+e^-) = \frac{1}{2s}\,\mathrm{Im}\,\mathcal{M}(e^+e^- \to e^+e^-) . \tag{18.80} $$

To compute the cross section for $e^+e^- \to$ hadrons, we consider in the
computation of the imaginary part only the contributions from hadronic intermediate
states. To leading order in $\alpha$, but to all orders in the strong interactions, these
contributions come from considering only diagrams of the form of Fig. 18.4,
and taking the imaginary part of the hadronic contributions to the vacuum
polarization.

The value of the diagrams shown in Fig. 18.4 is

$$ i\mathcal{M} = (-ie)^2\,\bar u(k)\gamma_\mu v(k_+)\,\frac{-i}{s}\,\bigl(i\Pi_h^{\mu\nu}(q)\bigr)\,\frac{-i}{s}\,\bar v(k_+)\gamma_\nu u(k) , \tag{18.81} $$

where $s = q^2$ and $\Pi_h^{\mu\nu}(q)$ is the hadronic part of the vacuum polarization. By
the Ward identity, this can be written

$$ \Pi_h^{\mu\nu}(q) = (q^2 g^{\mu\nu} - q^\mu q^\nu)\,\Pi_h(q^2) . \tag{18.82} $$

The $q^\mu q^\nu$ terms give zero when contracted with the external electron currents,
so only the $g^{\mu\nu}$ term survives. To evaluate the electron spinor part of (18.81),
we use the fact that, in this forward scattering amplitude, the initial and final
momenta and spins are set equal. Then, averaging over the initial spin gives

$$ \frac{1}{2}\cdot\frac{1}{2}\sum_{\text{spins}} \bar u(k)\gamma_\mu v(k_+)\,\bar v(k_+)\gamma^\mu u(k) = \frac{1}{4}\,\mathrm{tr}[\not{k}\,\gamma_\mu\,\not{k}_+\,\gamma^\mu] = \frac{1}{4}\cdot(-2)\cdot 4\,(k\cdot k_+) = -s . \tag{18.83} $$

Thus, we find

$$ \sigma(e^+e^- \to \text{hadrons}) = -\frac{4\pi\alpha}{s}\,\mathrm{Im}\,\Pi_h(s) . \tag{18.84} $$
To check this result, we can look back to the one-loop value of $\Pi$ in QED
(7.91), or to the imaginary part of this expression given in Eq. (7.92):

$$ \mathrm{Im}\,\Pi(s + i\epsilon) = -\frac{\alpha}{3}\sqrt{1 - \frac{4m^2}{s}}\,\Bigl(1 + \frac{2m^2}{s}\Bigr) . \tag{18.85} $$

Combining (18.85) with (18.84), we obtain the correct leading-order cross
section for production of a new heavy lepton in $e^+e^-$ annihilation,

$$ \sigma(e^+e^- \to L^+L^-) = \frac{4\pi\alpha^2}{3s}\sqrt{1 - \frac{4m^2}{s}}\,\Bigl(1 + \frac{2m^2}{s}\Bigr) . \tag{18.86} $$

If we multiply (18.86) by a factor of 3 for color and sum over quark flavors
with the squares of the quark charges, we obtain the leading-order prediction
of QCD.
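As a concrete illustration of (18.86) and of the color and charge factors just described, the sketch below evaluates the heavy-lepton cross section and the high-energy ratio $3\sum_f Q_f^2$ for a few sets of active quark flavors. The value of $\alpha$ and the function names are assumptions made for this example; cross sections come out in natural units (inverse energy squared) for $s$ in energy squared.

```python
import math
from fractions import Fraction

ALPHA = 1 / 137.036  # fine-structure constant, assumed low-energy value

def sigma_lepton_pair(s, m):
    """Leading-order cross section (18.86) for a lepton of mass m; zero below threshold."""
    if s <= 4 * m * m:
        return 0.0
    r = m * m / s
    return 4 * math.pi * ALPHA**2 / (3 * s) * math.sqrt(1 - 4 * r) * (1 + 2 * r)

def r_ratio(charges):
    """Color factor 3 times the sum of squared quark charges."""
    return 3 * sum(q * q for q in charges)

u, d, s_quark = Fraction(2, 3), Fraction(-1, 3), Fraction(-1, 3)
c, b = Fraction(2, 3), Fraction(-1, 3)

print(r_ratio([u, d, s_quark]))        # 2
print(r_ratio([u, d, s_quark, c]))     # 10/3
print(r_ratio([u, d, s_quark, c, b]))  # 11/3
```

Multiplying `sigma_lepton_pair(s, 0.0)` by `r_ratio(...)` gives the massless-quark estimate of the hadronic cross section at leading order.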

Now that we have relation (18.84), we complete the connection we wished
to prove by noting that the hadronic vacuum polarization is simply a matrix
element of a product of currents. Let $J^\mu$ be the electromagnetic current of
quarks,

$$ J^\mu = \sum_f Q_f\,\bar q_f\gamma^\mu q_f . \tag{18.87} $$

Then

$$ i\Pi_h^{\mu\nu}(q) = -e^2\int d^4x\,e^{iq\cdot x}\,\langle 0|\,T\{J^\mu(x)\,J^\nu(0)\}\,|0\rangle . \tag{18.88} $$

In the limit in which the point $x$ approaches 0, we can reduce the product
of currents by applying the operator product expansion. Since we will be
taking the vacuum expectation value of the product, we need only list the
contribution from operators that are gauge-invariant Lorentz scalars. Thus,

$$ J_\mu(x)\,J^\mu(0) \sim C_{JJ}^{\,1}(x)\cdot 1 + C_{JJ}^{\,\bar qq}(x)\cdot\bar qq(0) + C_{JJ}^{\,F^2}(x)\,(F^a_{\alpha\beta})^2(0) + \cdots . \tag{18.89} $$

Note that we have included the operator 1 on the right-hand side, and the
next possible operators in QCD have dimension 3 and 4, respectively. Since
the operator $\bar qq$ violates chiral symmetry, its coefficient function must have an
explicit factor of the quark mass. Thus, by dimensional analysis,

$$ C_{JJ}^{\,1} \sim x^{-6}, \qquad C_{JJ}^{\,\bar qq} \sim m\,x^{-2}, \qquad C_{JJ}^{\,F^2} \sim x^{-2} , \tag{18.90} $$

and the higher terms in the series are less singular as $x \to 0$.

To compute $\Pi_h^{\mu\nu}(q)$, we need the Fourier transform of the product of
currents. Assuming that this Fourier transform is indeed dominated by the limit
of short distances, we can compute it by Fourier-transforming the individual
OPE coefficients. Since the currents are conserved, the individual terms in the
OPE must give zero when dotted with $q^\mu$. Thus the transformed OPE takes
the form

$$ \int d^4x\,e^{iq\cdot x}\,J^\mu(x)\,J^\nu(0) \sim -i\,(q^2 g^{\mu\nu} - q^\mu q^\nu)\,\bigl[c^1(q^2)\cdot 1 + c^{\bar qq}(q^2)\cdot m\bar qq + c^{F^2}(q^2)\cdot(F^a_{\alpha\beta})^2 + \cdots\bigr] , \tag{18.91} $$

where the $c^i$ are Lorentz-invariant c-number functions of $q^2$, and the factor of
$i$ at the beginning of the right-hand side is inserted as a convenient convention.
By dimensional analysis, we find

$$ c^1 \sim (q^2)^0, \qquad c^{\bar qq} \sim (q^2)^{-2}, \qquad c^{F^2} \sim (q^2)^{-2} , \tag{18.92} $$

and the higher terms are more irrelevant for large $q$.

The OPE coefficients $c^i(q^2)$ can be computed from Feynman diagrams.
As shown in Fig. 18.5, the coefficient of the operator 1 is the sum of diagrams
with no external legs other than the current insertions. The leading QCD
diagram is just the simple vacuum polarization diagram, multiplied by the
color factor 3 and the sum of the squares of the quark charges. Combining
these factors with Eq. (7.91), we have

$$ c^1(q^2) = -\Bigl(3\sum_f Q_f^2\Bigr)\cdot\frac{1}{12\pi^2}\,\log(-q^2) . \tag{18.93} $$

The corrections to this result are of order $\alpha_s(q^2)$. The higher coefficient
functions are extracted from diagrams with more external legs. For example, the
coefficient function of $(F^a_{\alpha\beta})^2$ is determined by diagrams with two external
gluon legs.

Still assuming that the Fourier transform of the product of currents can be
computed from the OPE for the region of large timelike $q^2$, we can complete
our evaluation of the cross section for $e^+e^- \to$ hadrons by taking the vacuum
expectation value of (18.91), extracting the imaginary parts of the coefficient
functions, and substituting the result into (18.84). We find

$$ \sigma(e^+e^- \to \text{hadrons}) = \frac{(4\pi\alpha)^2}{s}\,\bigl[\mathrm{Im}\,c^1(q^2) + \mathrm{Im}\,c^{\bar qq}(q^2)\,\langle 0|\,m\bar qq\,|0\rangle + \mathrm{Im}\,c^{F^2}(q^2)\,\langle 0|\,(F^a_{\alpha\beta})^2\,|0\rangle + \cdots\bigr] . \tag{18.94} $$

The first term of this series is just the result of summing perturbative QCD
diagrams for the $e^+e^-$ total cross section. The additional terms give
corrections to this result which depend on soft hadronic matrix elements, but these
corrections are explicitly suppressed at high energy by factors $(q^2)^{-2}$.
(Incidentally, this expansion, which applies equally well in the absence of QCD
interactions, explains why (18.86) contains no term of order $s^{-2}$ when
expanded for large $s$.) If we insert the leading-order expression (18.93) into
(18.94), we obtain the familiar result

$$ \sigma(e^+e^- \to \text{hadrons}) = \frac{4\pi\alpha^2}{s}\,\sum_f Q_f^2 . \tag{18.95} $$
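The passage from the logarithm in the coefficient of the operator 1 to the familiar cross section rests on the imaginary part of $\log(-q^2)$ at timelike $q^2$. The sketch below makes that step explicit. It assumes the normalization in which $\Pi_h = -e^2 c^1$ with $c^1(q^2) = -(3\sum_f Q_f^2)\log(-q^2)/(12\pi^2)$, so that the cross section is $(4\pi\alpha)^2\,\mathrm{Im}\,c^1/s$; the $+i\epsilon$ prescription is imposed by hand, and the function names are illustrative.

```python
import cmath
import math

def c1(q2, sum_q2, eps=1e-12):
    """Leading-order coefficient of the unit operator, evaluated at q^2 + i*eps.

    For timelike q2 > 0 the argument -q2 - i*eps lies just below the negative
    real axis, so the logarithm picks up an imaginary part of -pi.
    """
    return -(3.0 * sum_q2) / (12.0 * math.pi**2) * cmath.log(complex(-q2, -eps))

def sigma_hadrons(s, sum_q2, alpha):
    """Cross section from the first term of the OPE series (assumed normalization)."""
    return (4.0 * math.pi * alpha) ** 2 / s * c1(s, sum_q2).imag

alpha = 1 / 137.036        # assumed value
sum_q2 = 4/9 + 1/9 + 1/9   # u, d, s quarks
s = 100.0                  # GeV^2, arbitrary illustration
print(sigma_hadrons(s, sum_q2, alpha), 4 * math.pi * alpha**2 * sum_q2 / s)
```

The two printed numbers agree, which is the content of inserting the leading-order coefficient into the OPE expression for the cross section.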
Figure 18.5. Feynman diagrams contributing the operator product
coefficient, in the expansion of the product of currents, for the operator (a) 1; (b)
$\bar qq$; (c) $(F^a_{\alpha\beta})^2$.

Our result (18.94) is pleasing, but the logic that led us to it was not
correct. To compute the $e^+e^-$ total cross section, we must compute $\Pi_h(q^2)$
in the region of large timelike momentum $q$, where the expectation value of
the product of currents is dominated by intermediate states of high energy,
involving large numbers of physical hadrons. Thus we need $\Pi_h(q^2)$ in precisely
the region where it is not dominated by short-distance perturbations of the
quark and gluon fields. To compute the product of currents from the
short-distance expansion, we choose kinematic conditions such that the intermediate
states that enter the computation of the product of currents are far off-shell,
so that they cannot propagate far from the converging points $x$ and 0. This
condition is satisfied at large spacelike momentum, or, equivalently, at small
spacelike separation. However, it seems at first sight that a computation in
this region is useless for determination of the $e^+e^-$ cross section.

Fortunately, there is a wonderful trick for relating the values of a quantum
field theory amplitude in two well-separated kinematic regions. This trick,
called the method of dispersion relations, makes use of the general analytic
properties of the amplitude. Since (18.88) is the Fourier transform of a
two-point correlation function, we know from the analysis of Section 7.1 that
$\Pi_h(q^2)$ possesses a Källén-Lehmann spectral representation. Thus, $\Pi_h(q^2)$ is
an analytic function of $q^2$ with a branch cut on the positive $q^2$ axis and no
other singularities in the complex $q^2$ plane. This analytic structure is shown in
Fig. 18.6. The discontinuity of $\Pi_h(q^2)$ across the branch cut is $(2i)$ times the
imaginary part of $\Pi_h$ and so is directly related to the total $e^+e^-$ annihilation
cross section.

Figure 18.6. Analytic singularities of $\Pi_h(q^2)$ in the complex $q^2$ plane.

With this additional knowledge about $\Pi_h(q^2)$, we can argue as follows.
Let $q^2 = -Q_0^2$ be a value sufficiently far into the spacelike region of $q$ that
the Fourier transform of the product of currents can be computed from the
operator product expansion. Now consider the integral

$$ I_n = -4\pi\alpha\oint\frac{dq^2}{2\pi i}\,\frac{1}{(q^2+Q_0^2)^{n+1}}\,\Pi_h(q^2) \tag{18.96} $$

for $n \ge 1$, evaluated on a contour encircling $q^2 = -Q_0^2$. If we contract the
contour onto the pole, we find

$$ I_n = -4\pi\alpha\,\frac{1}{n!}\,\frac{d^n}{d(q^2)^n}\,\Pi_h\,\Big|_{q^2=-Q_0^2} , \tag{18.97} $$

which can be computed by evaluating $\Pi_h$ from the operator product
relation (18.91),

$$ \Pi_h(q^2) = -e^2\,\bigl[c^1(q^2) + c^{\bar qq}(q^2)\,\langle 0|\,m\bar qq\,|0\rangle + c^{F^2}(q^2)\,\langle 0|\,(F^a_{\alpha\beta})^2\,|0\rangle + \cdots\bigr] . \tag{18.98} $$

On the other hand, we can evaluate the integral by distorting the contour to
the form of Fig. 18.7. Since none of the coefficient functions grow faster than
$(q^2)^0$ times logarithms as $q^2 \to \infty$, the contour at infinity can be neglected
for $n \ge 1$. The piece of the contour that wraps around the branch cut gives

$$ I_n = -4\pi\alpha\int\frac{dq^2}{2\pi i}\,\frac{1}{(q^2+Q_0^2)^{n+1}}\,\mathrm{Disc}\,\Pi_h(q^2) = -4\pi\alpha\int\frac{dq^2}{2\pi i}\,\frac{1}{(q^2+Q_0^2)^{n+1}}\,2i\,\mathrm{Im}\,\Pi_h(q^2) = \frac{1}{\pi}\int_0^\infty ds\,\frac{s}{(s+Q_0^2)^{n+1}}\,\sigma(s) . \tag{18.99} $$

This is an integral over the total cross section for $e^+e^- \to$ hadrons. By
equating (18.97) and (18.99), we obtain a series of integral relations between the
OPE coefficients, evaluated in QCD perturbation theory, and the observable
cross section. These relations, which were first constructed by Novikov,
Shifman, Voloshin, Vainshtein, and Zakharov, are known as the ITEP sum rules.†


†The theory of these sum rules is reviewed in V. A. Novikov, L. B. Okun, M. A.
Shifman, A. I. Vainshtein, M. B. Voloshin, and V. I. Zakharov, Phys. Repts. 41, 1
(1978).
Figure 18.7. Contour of integration involved in the derivation of the ITEP
sum rules for $\sigma(e^+e^- \to$ hadrons$)$.

Evaluating the sum rules with only the leading QCD expression for $c^1(q^2)$, we
find

$$ \int_0^\infty ds\,\frac{s}{(s+Q_0^2)^{n+1}}\,\sigma(s) = \frac{4\pi\alpha^2}{n\,(Q_0^2)^n}\,\sum_f Q_f^2 + \mathcal{O}(\alpha_s(Q_0^2)) + \cdots . \tag{18.100} $$

The leading-order relation is consistent with the lowest-order cross section
given in Eq. (18.95). The corrections come from higher orders of QCD
perturbation theory, with $\alpha_s$ taken at the scale $Q_0^2$, and from the higher operator
terms in the OPE.
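The consistency between the leading term of the sum rule and the lowest-order cross section can be verified directly. The sketch below inserts $\sigma(s) = 4\pi\alpha^2 \sum_f Q_f^2 / s$ into the weighted integral over $s$ and compares with $4\pi\alpha^2 \sum_f Q_f^2 / (n\,(Q_0^2)^n)$. The change of variable $s = Q_0^2\,t/(1-t)$, the midpoint rule, and the numerical inputs are choices made for this illustration.

```python
import math

def sum_rule_lhs(n, q0_sq, sum_q2, alpha, steps=20000):
    """Midpoint-rule estimate of the integral of s*sigma(s)/(s+Q0^2)^(n+1) over s >= 0."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        s = q0_sq * t / (1.0 - t)          # maps t in (0, 1) onto s in (0, infinity)
        jacobian = q0_sq / (1.0 - t) ** 2  # ds/dt
        sigma = 4.0 * math.pi * alpha**2 * sum_q2 / s
        total += s * sigma / (s + q0_sq) ** (n + 1) * jacobian * h
    return total

def sum_rule_rhs(n, q0_sq, sum_q2, alpha):
    """Leading-order right-hand side of the sum rule."""
    return 4.0 * math.pi * alpha**2 * sum_q2 / (n * q0_sq**n)

alpha = 1 / 137.036  # assumed value
sum_q2 = 2.0 / 3.0   # u, d, s quarks
for n in (1, 2, 3):
    print(n, sum_rule_lhs(n, 4.0, sum_q2, alpha), sum_rule_rhs(n, 4.0, sum_q2, alpha))
```

Because a real cross section has resonance structure on top of this smooth form, the agreement shown here is only the statement that the leading terms match.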

If the correction terms in (18.100) converged to zero uniformly in $n$, we
could invert the sum rules and derive from them our result (18.94). However,
the true situation is more subtle. Because the derivatives in (18.97)
emphasize terms with stronger $q^2$ variation, the correction terms in the ITEP sum
rules are more and more important as $n$ increases. Thus the most important
deviations of the cross section from the prediction of QCD perturbation
theory are oscillations about this prediction, which average out in the sum rules
for low $n$. The comparison of theory and experiment is shown in Fig. 18.8. At
large $s$, (18.94) is quite accurate. As $s$ becomes smaller, however, the
oscillations grow in size. Eventually, they come to dominate the total cross section
as the resonances associated with quark-antiquark bound states.

18.5 Operator Analysis of Deep Inelastic Scattering 

We now apply the operator product expansion to another example of a QCD
hard-scattering process, deep inelastic electron scattering. In Chapter 17 we
found that the predictions of QCD for deep inelastic scattering are precise
but also intricate in structure. At a first level, QCD implies that deep
inelastic scattering is described by the parton model, in which the incident electron
scatters from quarks and antiquarks that carry fractions of the total
momentum of the proton. These fractions are determined by parton distribution
functions, which reflect the form of the proton wavefunction and are
determined by soft QCD dynamics. However, we saw in Section 17.5 that effects of
QCD perturbation theory cause the parton distributions to change their form
as a function of the momentum transfer $Q^2$. We will now show that much of
this picture can be reconstructed from our new viewpoint, using the operator
product expansion.

Figure 18.8. Experimental measurements of the total cross section for the
reaction $e^+e^- \to$ hadrons at energies below 3 GeV, compared to the
prediction of perturbative QCD for 3 quark flavors. The data are taken from the
compilation of M. Swartz, Phys. Rev. D53, 5268 (1996). Complete references
to the various results are given there.

In the previous section, we derived the OPE relations for the $e^+e^-$
annihilation cross section in three steps. First, we used the optical theorem to
relate this cross section to a matrix element of a product of currents. Second,
we applied the operator product expansion to the product of currents.
Unfortunately, this expansion could be used only in an unphysical kinematic region.
However, in the third step, we used the method of dispersion relations to
connect this unphysical result to an integral over the cross section we wished to
predict. In our discussion of deep inelastic scattering, we will go through these
same three steps. To obtain our final result, we will need to add a fourth step,
involving QCD operator rescaling.
Figure 18.9. Computation of the cross section for deep inelastic electron
scattering: (a) general structure of the amplitudes; (b) application of the
optical theorem.

Kinematics of Deep Inelastic Scattering

We begin by writing a general expression for the deep inelastic scattering cross
section. The matrix element for deep inelastic electron scattering to a final
state $f$ is computed as shown in Fig. 18.9(a):

$$ i\mathcal{M}(ep \to ef) = (-ie)\,\bar u(k')\gamma_\mu u(k)\,\frac{-i}{q^2}\,(ie)\int d^4x\,e^{iq\cdot x}\,\langle f|\,J^\mu(x)\,|P\rangle , \tag{18.101} $$

where $J^\mu(x)$ is the quark electromagnetic current (18.87). The core of this
expression is the hadronic matrix element of the current between the proton
and some high-energy hadronic state. This matrix element must be squared
and summed over possible final states. That sum can be computed, using the
optical theorem, by relating it to the forward matrix element of two currents
in the proton state, as shown in Fig. 18.9(b). Define

$$ W^{\mu\nu} = i\int d^4x\,e^{iq\cdot x}\,\langle P|\,T\{J^\mu(x)\,J^\nu(0)\}\,|P\rangle , \tag{18.102} $$

averaged over the spin of the proton. This object is known as the forward
Compton amplitude, since if it is evaluated at $q^2 = 0$ and contracted with
physical polarization vectors, it gives the forward amplitude for photon-proton
scattering:

$$ i\mathcal{M}(\gamma p \to \gamma p) = (ie)^2\,\epsilon^*_\mu(q)\,\epsilon_\nu(q)\,\bigl(-iW^{\mu\nu}(P,q)\bigr) . \tag{18.103} $$

However, in the following discussion we will need to analyze (18.102) for
general spacelike $q$ and for general polarization states.

The optical theorem for Compton scattering from a proton is

$$ 2\,\mathrm{Im}\,\mathcal{M}(\gamma p \to \gamma p) = \sum_f \int d\Pi_f\,\bigl|\mathcal{M}(\gamma p \to f)\bigr|^2 . \tag{18.104} $$

In the generalization given in (7.49), this result extends to the more general
situation in which the initial and final photon polarizations can differ
arbitrarily. Transcribing (18.104) to $W^{\mu\nu}$, we find

$$ 2\,\mathrm{Im}\,W^{\mu\nu}(P,q) = \sum_f \int d\Pi_f\,\langle P|\,J^\mu(-q)\,|f\rangle\,\langle f|\,J^\nu(q)\,|P\rangle , \tag{18.105} $$

where $J^\mu(q)$ is the Fourier transform of the current.

We can now compute the deep inelastic cross section in terms of W ^ v , 
using (18.105) to represent the square of the last factor. The cross section 
should be averaged over initial and summed over final electron spins. Thus, 


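$$ \sigma(ep \to eX) = \frac{1}{2s}\int\frac{d^3k'}{(2\pi)^3}\frac{1}{2k'}\;e^4\,\frac12\sum_{\text{spins}}\big[\bar u(k)\gamma_\mu u(k')\,\bar u(k')\gamma_\nu u(k)\big]\left(\frac{1}{Q^2}\right)^2 \cdot 2\,\mathrm{Im}\,W^{\mu\nu}(P,q). \tag{18.106} $$

The electron spinor product can be evaluated as 

$$ \frac12\sum_{\text{spins}}\big[\bar u(k)\gamma_\mu u(k')\,\bar u(k')\gamma_\nu u(k)\big] = \frac12\,\mathrm{tr}\big[\not k\,\gamma_\mu\,\not k'\,\gamma_\nu\big] = 2\big(k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}\,k\cdot k'\big). \tag{18.107} $$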


It is useful to convert the integral over the final electron momentum $k'$ and 
scattering angle $\theta$ to an integral over the dimensionless variables $x$ and $y$ that 
we introduced in Section 17.3. These variables are given in terms of the initial 
and final electron energies $k$ and $k'$ by 

$$ x = \frac{Q^2}{2P\cdot q} = \frac{2kk'(1-\cos\theta)}{2m(k-k')}, \qquad y = \frac{2P\cdot q}{2P\cdot k} = \frac{k-k'}{k}. \tag{18.108} $$

Then 

$$ \frac{\partial(x,y)}{\partial(k',\cos\theta)} = \frac{2k'}{2m(k-k')} = \frac{2k'}{ys}, \tag{18.109} $$

and so 

$$ \int\frac{d^3k'}{(2\pi)^3}\frac{1}{2k'} = \int\frac{2\pi\,dk'\,k'^2\,d\cos\theta}{(2\pi)^3\,2k'} = \int dx\,dy\,\frac{ys}{(4\pi)^2}. \tag{18.110} $$

Using (18.107) and (18.110) to simplify (18.106), we find 


$$ \frac{d^2\sigma}{dx\,dy}(ep \to eX) = \frac{2\alpha^2 y}{(Q^2)^2}\,\big(k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}\,k\cdot k'\big)\,\mathrm{Im}\,W^{\mu\nu}(P,q). \tag{18.111} $$
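The change of variables in (18.108) to (18.110) can be checked numerically. The following is a minimal sketch in Python, assuming a massless electron and the proton rest frame, where $P\cdot q = m(k-k')$ and $ys = 2m(k-k')$. The function names and the sample kinematics are illustrative choices, not part of the text.

```python
import math

def xy_from_lab(k, kp, cos_theta, m):
    """Return (x, y) of Eq. (18.108) for electron energies k, kp in the proton rest frame."""
    Q2 = 2.0 * k * kp * (1.0 - cos_theta)   # Q^2 = -q^2 for a massless electron
    P_dot_q = m * (k - kp)                  # proton at rest: P = (m, 0, 0, 0)
    return Q2 / (2.0 * P_dot_q), (k - kp) / k

def jacobian_numeric(k, kp, cos_theta, m, h=1e-6):
    """|d(x, y) / d(k', cos theta)| from central finite differences."""
    xa, ya = xy_from_lab(k, kp + h, cos_theta, m)
    xb, yb = xy_from_lab(k, kp - h, cos_theta, m)
    dx_dkp, dy_dkp = (xa - xb) / (2 * h), (ya - yb) / (2 * h)
    xa, ya = xy_from_lab(k, kp, cos_theta + h, m)
    xb, yb = xy_from_lab(k, kp, cos_theta - h, m)
    dx_dc, dy_dc = (xa - xb) / (2 * h), (ya - yb) / (2 * h)
    return abs(dx_dkp * dy_dc - dx_dc * dy_dkp)

def jacobian_formula(k, kp, m):
    """Right-hand side of Eq. (18.109): 2k' / (2m(k - k')) = 2k' / (y s)."""
    return 2.0 * kp / (2.0 * m * (k - kp))

def measure_lab(kp):
    """Coefficient of dk' dcos(theta) in d^3k' / ((2 pi)^3 2k')."""
    return 2.0 * math.pi * kp ** 2 / ((2.0 * math.pi) ** 3 * 2.0 * kp)

def measure_xy(k, kp, m):
    """Coefficient of dx dy in Eq. (18.110): y s / (4 pi)^2, with y s = 2m(k - k')."""
    return 2.0 * m * (k - kp) / (4.0 * math.pi) ** 2

if __name__ == "__main__":
    k, kp, c, m = 100.0, 60.0, 0.995, 0.938   # GeV; illustrative values only
    print(xy_from_lab(k, kp, c, m))
    print(jacobian_numeric(k, kp, c, m), jacobian_formula(k, kp, m))
    print(measure_lab(kp) / jacobian_formula(k, kp, m), measure_xy(k, kp, m))
```

Dividing the lab-frame measure by the Jacobian reproduces $ys/(4\pi)^2$, which is the factor that combines with $1/2s$ and $e^4 = (4\pi\alpha)^2$ to give the prefactor of (18.111).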


To go further, we need to know something about the structure of $W^{\mu\nu}$. In 
the previous section, we used current conservation to write the matrix element 
of currents in terms of a single scalar function $\Pi_h(Q^2)$, as in Eq. (18.82). In 
the case of the forward Compton amplitude, the Ward identity again requires 

$$ q_\mu W^{\mu\nu} = q_\nu W^{\mu\nu} = 0, \tag{18.112} $$

but now there are two possible tensors built from $P$ and $q$ that satisfy these 
constraints. Thus the forward Compton amplitude is written as an expression 
involving two scalar form factors: 

$$ W^{\mu\nu} = \left(-g^{\mu\nu} + \frac{q^\mu q^\nu}{q^2}\right)W_1 + \left(P^\mu - q^\mu\,\frac{P\cdot q}{q^2}\right)\left(P^\nu - q^\nu\,\frac{P\cdot q}{q^2}\right)W_2. \tag{18.113} $$

The scalar functions $W_1$, $W_2$ depend on the two invariants of the problem, 
$(P\cdot q)$ and $q^2$, or, alternatively, $x$ and $Q^2$. If we insert (18.113) into (18.111) 
and use the fact that dotting $q^\mu$ with the lepton tensor gives zero, we find 


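$$ \frac{d^2\sigma}{dx\,dy}(ep \to eX) = \frac{2\alpha^2 y}{(Q^2)^2}\,\big[2k\cdot P\,k'\cdot P\;\mathrm{Im}\,W_2 + 2k\cdot k'\;\mathrm{Im}\,W_1\big] = \frac{\alpha^2 y}{(Q^2)^2}\,\big[s^2(1-y)\,\mathrm{Im}\,W_2 + 2xys\,\mathrm{Im}\,W_1\big]. \tag{18.114} $$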
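The contraction that leads from (18.111) and (18.113) to (18.114) can be verified with explicit four-vectors. The sketch below assumes the metric signature $(+,-,-,-)$ and a massless electron; the helper names are hypothetical. It keeps the term $-P^2\,k\cdot k'\,W_2$, which (18.114) neglects because the proton mass is small compared with $s$.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g_{mu nu}, signature (+,-,-,-)

def dot(a, b):
    """Minkowski product a.b."""
    return sum(g * ai * bi for g, ai, bi in zip(METRIC, a, b))

def lower(a):
    """Lower the index of a contravariant four-vector."""
    return [g * ai for g, ai in zip(METRIC, a)]

def lepton_tensor(k, kp):
    """L_{mu nu} = k_mu k'_nu + k_nu k'_mu - g_{mu nu} k.k' (lower indices)."""
    kl, kpl, kk = lower(k), lower(kp), dot(k, kp)
    return [[kl[m] * kpl[n] + kl[n] * kpl[m] - (METRIC[m] if m == n else 0.0) * kk
             for n in range(4)] for m in range(4)]

def hadron_tensor(P, q, W1, W2):
    """W^{mu nu} of Eq. (18.113) (upper indices)."""
    q2, Pq = dot(q, q), dot(P, q)
    Pt = [P[m] - q[m] * Pq / q2 for m in range(4)]
    return [[(-(METRIC[m] if m == n else 0.0) + q[m] * q[n] / q2) * W1 + Pt[m] * Pt[n] * W2
             for n in range(4)] for m in range(4)]

def contract(L, W):
    """Full contraction L_{mu nu} W^{mu nu}."""
    return sum(L[m][n] * W[m][n] for m in range(4) for n in range(4))

def check_contraction(k, kp, P, W1, W2):
    """Return (explicit contraction, closed form) for q = k - k'."""
    q = tuple(a - b for a, b in zip(k, kp))
    full = contract(lepton_tensor(k, kp), hadron_tensor(P, q, W1, W2))
    closed = 2.0 * dot(k, kp) * W1 + (2.0 * dot(k, P) * dot(kp, P) - dot(P, P) * dot(k, kp)) * W2
    return full, closed
```

The two returned numbers agree for any massless $k$, $k'$, which is the statement that the $q^\mu$ terms of (18.113) drop out against the lepton tensor.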


Expression (18.114) is completely general and makes no assumptions 
about the nature of the strong interactions. It is also rather formal. However, 
we can easily get an idea of the relation of this formula to our earlier 
analysis by evaluating $W^{\mu\nu}$ in the parton model and working out the parton 
expressions for $W_1$ and $W_2$. In the parton model, we replace the proton 
matrix element in (18.102) by a sum of quark matrix elements, weighted with 
the parton distribution functions. Thus, 

$$ W^{\mu\nu} = i\int d^4x\, e^{iq\cdot x}\int_0^1 d\xi\,\sum_f f_f(\xi)\,\frac{1}{\xi}\,\langle q_f(p)|\,T\{J^\mu(x)J^\nu(0)\}\,|q_f(p)\rangle\Big|_{p=\xi P}. \tag{18.115} $$

The factor $(1/\xi)$ in front of the matrix element gives the proper normalization 
of the proton state in terms of the quark states. The simplest way to understand 
this factor is to note that the kinematic prefactor $(1/2s)$ in (18.106) and 
in other expressions involving an initial-state proton becomes $(1/2\xi s)$, under 
the $\xi$ integral, in the parton model. 

We now evaluate the matrix element in (18.115) using noninteracting 
fermions. There are two Feynman diagrams, shown in Fig. 18.10. The first 
diagram on the right in Fig. 18.10 has the value 

$$ i\int_0^1 d\xi\,\sum_f f_f(\xi)\,\frac{1}{\xi}\,Q_f^2\;\bar u(p)\gamma^\mu\,\frac{i(\not p + \not q)}{(p+q)^2 + i\epsilon}\,\gamma^\nu u(p); \tag{18.116} $$

the second diagram gives a contribution identical to this one after the 
interchange of $q$, $\mu$ with $(-q)$, $\nu$. To evaluate (18.116), we average over the quark 
spin to find 

$$ \int_0^1 d\xi\,\sum_f Q_f^2\,f_f(\xi)\,\frac{1}{\xi}\;\frac12\,\mathrm{tr}\big[\not p\,\gamma^\mu(\not p + \not q)\gamma^\nu\big]\,\frac{-1}{2p\cdot q + q^2 + i\epsilon} $$
$$ = \int_0^1 d\xi\,\sum_f Q_f^2\,f_f(\xi)\,\frac{1}{\xi}\;2\big(p^\mu(p+q)^\nu + p^\nu(p+q)^\mu - g^{\mu\nu}\,p\cdot(p+q)\big)\,\frac{-1}{2\xi P\cdot q - Q^2 + i\epsilon}. \tag{18.117} $$

Figure 18.10. Evaluation of $W^{\mu\nu}$ in the parton model. 


The imaginary part of this expression, which we need to evaluate (18.114), 
comes from the last factor in (18.117): 

$$ \mathrm{Im}\left(\frac{-1}{2\xi P\cdot q - Q^2 + i\epsilon}\right) = \pi\,\delta(2\xi P\cdot q - Q^2) = \frac{\pi}{ys}\,\delta(\xi - x). \tag{18.118} $$

In the second diagram of Fig. 18.10, the two factors in the denominator have 
a relative + sign, so this diagram has no imaginary part in the physical region 
for deep inelastic scattering. Thus, we find that in the parton model, 

$$ \mathrm{Im}\,W^{\mu\nu} = \sum_f Q_f^2\,f_f(x)\,\frac{1}{x}\,\frac{\pi}{ys}\,\big(4x^2 P^\mu P^\nu + 2x(P^\mu q^\nu + P^\nu q^\mu) - g^{\mu\nu}\,xys\big). \tag{18.119} $$

By adding and subtracting terms proportional to $q^\mu q^\nu$, we can see that this 
expression is of the form (18.113), with 

$$ \mathrm{Im}\,W_1 = \pi\sum_f Q_f^2\,f_f(x), \qquad \mathrm{Im}\,W_2 = \frac{4\pi}{ys}\sum_f Q_f^2\,x f_f(x). \tag{18.120} $$

The parton model expressions for $W_1$ and $W_2$ obey the relation 

$$ \mathrm{Im}\,W_1 = \frac{ys}{4x}\,\mathrm{Im}\,W_2. \tag{18.121} $$

This is another form of the Callan-Gross relation, since the substitution of 
(18.121) into (18.114) gives 

$$ \frac{d^2\sigma}{dx\,dy}(ep \to eX) = \frac{\alpha^2 y s^2}{2Q^4}\,\big[1 + (1-y)^2\big]\,\mathrm{Im}\,W_2, \tag{18.122} $$

with the $y$ dependence characteristic of free fermions, as in Eq. (17.125). 
Finally, substituting from (18.120) for the imaginary part of $W_2$, we recover 
this parton model expression precisely: 

$$ \frac{d^2\sigma}{dx\,dy}(ep \to eX) = \Big(\sum_f Q_f^2\,x f_f(x)\Big)\,\frac{2\pi\alpha^2 s}{Q^4}\,\big[1 + (1-y)^2\big]. \tag{18.123} $$

This equation will give us a reference point for comparison with more general 
expressions that we will derive as we continue our analysis. 
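The chain from (18.114) through (18.120) and (18.121) to (18.123) is easy to confirm numerically. The following sketch assumes $Q^2 = xys$ and uses made-up distribution shapes for two flavors; the names `CHARGES` and `PDFS` and the functional forms are purely illustrative and are not fitted to data.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate)

# Toy inputs, purely illustrative: quark charges and invented distribution shapes.
CHARGES = {"u": 2.0 / 3.0, "d": -1.0 / 3.0}
PDFS = {
    "u": lambda x: 2.0 * (1.0 - x) ** 3,
    "d": lambda x: 1.0 * (1.0 - x) ** 4,
}

def charge_weighted(x):
    """sum_f Q_f^2 f_f(x)."""
    return sum(CHARGES[f] ** 2 * PDFS[f](x) for f in PDFS)

def im_w1(x):
    """Im W_1 of Eq. (18.120)."""
    return math.pi * charge_weighted(x)

def im_w2(x, y, s):
    """Im W_2 of Eq. (18.120)."""
    return 4.0 * math.pi / (y * s) * x * charge_weighted(x)

def xsec_general(x, y, s):
    """Eq. (18.114) evaluated with the parton-model form factors."""
    Q2 = x * y * s
    return ALPHA ** 2 * y / Q2 ** 2 * (s ** 2 * (1.0 - y) * im_w2(x, y, s)
                                       + 2.0 * x * y * s * im_w1(x))

def xsec_parton(x, y, s):
    """Eq. (18.123)."""
    Q2 = x * y * s
    return (x * charge_weighted(x) * 2.0 * math.pi * ALPHA ** 2 * s / Q2 ** 2
            * (1.0 + (1.0 - y) ** 2))
```

For any $x$, $y$, $s$ the two cross-section functions agree, and `im_w1(x)` equals `y * s / (4 * x) * im_w2(x, y, s)`, which is the Callan-Gross relation (18.121).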

Expansion of the Operator Product 

Since the forward Compton amplitude is a matrix element of a product of 
currents, an alternative strategy for calculating $W^{\mu\nu}$ is to expand this product 
as a series of local operators. Like the parton model evaluation, this method 
makes use of asymptotic freedom. However, in this case, the assumption is 
applied more directly. The computation of the operator product coefficients 
will take place explicitly at a small distance of order $1/Q$, and so we can 
calculate these coefficients in a perturbation theory whose coupling constant 
is $\alpha_s(Q^2)$. 

In the previous section, we computed the coefficients of operators that 
contribute to the vacuum expectation value of the product of currents by 
considering the various ways of contracting the quark fields in the product. 
Here, we should note that the operator 1 does not contribute to the Compton 
scattering amplitude. The leading contributions come from operators that can 
create and annihilate quarks in the proton wavefunction. 

The most important terms in the operator product of two currents $J^\mu$ 
come from products of two quark currents $\bar q_f\gamma^\mu q_f$ with quarks of the same 
flavor. Therefore we will begin by studying the OPE of the individual quark 
currents. To zeroth order in $\alpha_s$, the leading terms of the operator product of 
quark currents are given by 

$$ \bar q\gamma^\mu q(x)\;\bar q\gamma^\nu q(0) = \bar q(x)\gamma^\mu\,\underbracket{q(x)\,\bar q(0)}\,\gamma^\nu q(0) + \underbracket{\bar q(x)\gamma^\mu q(x)\,\bar q(0)\gamma^\nu q(0)} + \cdots, \tag{18.124} $$

where the brackets mark contractions (in the second term, of $\bar q(x)$ with $q(0)$), 
which should be evaluated as Feynman propagators for the 
quark fields. The terms with explicit contractions are singular as $x \to 0$; 
the remaining terms are nonsingular and thus less important in the 
short-distance limit. In the OPE of currents with quarks of different flavor, there 
are no corresponding singular terms; we will argue below that this conclusion 
is valid even beyond the leading order in $\alpha_s$. 

To evaluate $W^{\mu\nu}$, we must take the Fourier transform of the terms in 
(18.124), as indicated in (18.102). When we do this, we should remember that 
the propagators carry not only the Fourier transform momentum $q$ but also 
whatever momentum is carried in through the quark fields. To take account 
of this, it is convenient to represent 

$$ \int d^4x\, e^{iq\cdot x}\,\bar q(x)\gamma^\mu\,\underbracket{q(x)\,\bar q(0)}\,\gamma^\nu q(0) = \bar q\,\gamma^\mu\,\frac{i(i\not\partial + \not q)}{(i\partial + q)^2}\,\gamma^\nu q, \tag{18.125} $$

where the derivatives $\partial$ act to the right on the quark field. Notice that this 
contribution has the structure of the first diagram on the right in Fig. 18.10. 
Similarly, the second contraction indicated in (18.124) has the form of the 
second diagram in Fig. 18.10. 

In the short-distance limit, the momentum $q$ will be larger than any 
external momentum entering the quark fields. Thus we should expand 

$$ \frac{1}{(i\partial + q)^2} = \frac{-1}{Q^2 - 2iq\cdot\partial + \partial^2} = -\frac{1}{Q^2}\sum_{n=0}^{\infty}\left(\frac{2iq\cdot\partial - \partial^2}{Q^2}\right)^n. \tag{18.126} $$

We will argue below that the terms with $\partial^2$ in the numerator are unimportant 
and may be dropped. However, we should retain all powers of the ratio 
$(2iq\cdot\partial/Q^2)$. This ratio has $Q^2$ in the denominator and so is formally 
suppressed in the short-distance limit. However, in the parton model 

$$ \frac{2iq\cdot\partial}{Q^2} \to \frac{2q\cdot\xi P}{Q^2}, \tag{18.127} $$

so, eventually, all of these terms must be equally important. We will see how 
this works in a moment. 

The last step required to reduce the operator product (18.124) to a useful 
form is to reduce the product of Dirac matrices. We know from (18.113) 
that, after we average over the proton spin, $W^{\mu\nu}$ will be symmetric under 
the interchange of $\mu$ and $\nu$. Thus, it does no harm to symmetrize the OPE. 
We can then reduce the product of three Dirac matrices to one by using the 
identity 

$$ \frac12\big(\gamma^\mu\gamma^\alpha\gamma^\nu + \gamma^\nu\gamma^\alpha\gamma^\mu\big) = g^{\mu\alpha}\gamma^\nu + g^{\nu\alpha}\gamma^\mu - g^{\mu\nu}\gamma^\alpha, \tag{18.128} $$

which is easily proved from the anticommutation relations. By the use of 
(18.126) and (18.128), we can rewrite (18.125) as 

$$ -i\,\bar q\,\Big(\gamma^\mu(i\partial^\nu) + \gamma^\nu(i\partial^\mu) - g^{\mu\nu}\,i\not\partial + \gamma^\mu q^\nu + \gamma^\nu q^\mu - g^{\mu\nu}\not q\Big)\,\frac{1}{Q^2}\sum_{n=0}^{\infty}\left(\frac{2iq\cdot\partial}{Q^2}\right)^n q. \tag{18.129} $$

We can remove the term $(i\not\partial)q$, which vanishes to leading order in $\alpha_s$, since 
the quark field obeys the Dirac equation. To compute $W_1$ and $W_2$, we can 
also drop the terms with explicit factors of $q^\mu$, since these will eventually be 
organized into the general form (18.113). Then, finally, (18.125) takes the form 

$$ \int d^4x\, e^{iq\cdot x}\,\bar q(x)\gamma^\mu\,\underbracket{q(x)\,\bar q(0)}\,\gamma^\nu q(0) = -i\,\bar q\,\big(2\gamma^\mu(i\partial^\nu) - g^{\mu\nu}\not q\big)\,\frac{1}{Q^2}\sum_{n=0}^{\infty}\left(\frac{2iq\cdot\partial}{Q^2}\right)^n q, \tag{18.130} $$

symmetrized under $\mu \leftrightarrow \nu$. 

The second term in (18.124) differs from the first by the interchange of the 
points $x$ and 0 and the interchange of indices $\mu$ and $\nu$. Its Fourier transform is 
thus given by (18.130) with the replacement $q \to -q$. The complete operator 
product therefore contains only terms even in $q$. All remaining contributions 
from the singular terms of the operator product contain the operator 

$$ \bar q\,\gamma^{\mu_1}(i\partial^{\mu_2})\cdots(i\partial^{\mu_n})\,q, \tag{18.131} $$

with an even number of indices, with these indices either identified with $\mu$ or 
$\nu$ or contracted with powers of $q$. To write the relevant terms of the operator 
product expansion, we will modify this operator in two ways. First, since the 
operator in (18.131) has $n$ vector indices, it contains components that transform 
under many different irreducible representations of the Lorentz group. 
Each component has a different rescaling law under renormalization. However, 
we will see below that only the component of (18.131) with the highest spin 
is relevant to our analysis. This component is obtained by totally symmetrizing 
the indices $\mu_1, \ldots, \mu_n$ and then subtracting terms proportional to $g^{\mu_i\mu_j}$ 
so that the operator is traceless on all pairs of indices. We will retain only 
this component when we write out the operator product of currents. Second, 
the operator (18.131) does not transform simply under gauge transformations. 
Since the original currents $J^\mu$ were invariant to color gauge transformations, 
the operator product of two currents must be a sum of gauge-invariant operators. 
We can make (18.131) gauge-invariant by replacing each factor of $(i\partial^\mu)$ 
with a covariant derivative $(iD^\mu)$. This modification adds only terms proportional 
to the strong coupling constant $g$, so it has no effect on our derivation 
of the operator product coefficients. 
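The Dirac-matrix identity (18.128), used above to reduce three gamma matrices to one, can be checked by brute force. The sketch below assumes the Dirac representation of the gamma matrices and the metric $(+,-,-,-)$; any representation satisfying the anticommutation relations would do. All helper names are illustrative.

```python
def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def neg2(m):
    return [[-x for x in row] for row in m]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(4)] for i in range(4)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# gamma^0 = diag(1, 1, -1, -1); gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [block(I2, Z2, Z2, neg2(I2))] + [block(Z2, s, neg2(s), Z2) for s in SIGMA]
G = [1, -1, -1, -1]

def metric(m, n):
    return G[m] if m == n else 0

def max_abs(m):
    return max(abs(x) for row in m for x in row)

def identity_residual():
    """Largest matrix-element violation of Eq. (18.128) over all mu, alpha, nu."""
    worst = 0.0
    for mu in range(4):
        for al in range(4):
            for nu in range(4):
                lhs = scale(0.5, add(mul(mul(GAMMA[mu], GAMMA[al]), GAMMA[nu]),
                                     mul(mul(GAMMA[nu], GAMMA[al]), GAMMA[mu])))
                rhs = add(scale(metric(mu, al), GAMMA[nu]),
                          scale(metric(nu, al), GAMMA[mu]),
                          scale(-metric(mu, nu), GAMMA[al]))
                worst = max(worst, max_abs(add(lhs, scale(-1, rhs))))
    return worst

def clifford_residual():
    """Largest violation of {gamma^mu, gamma^nu} = 2 g^{mu nu}."""
    ident = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    return max(max_abs(add(mul(GAMMA[m], GAMMA[n]), mul(GAMMA[n], GAMMA[m]),
                           scale(-2 * metric(m, n), ident)))
               for m in range(4) for n in range(4))
```

Both residuals vanish, which confirms that the identity follows from the anticommutation relations alone.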

Incorporating these changes, let us define a spin-$n$ operator with quarks 
of flavor $f$ as follows: 

$$ \mathcal{O}_f^{(n)\mu_1\cdots\mu_n} = \bar q_f\,\gamma^{\{\mu_1}(iD^{\mu_2})\cdots(iD^{\mu_n\}})\,q_f - \text{traces}, \tag{18.132} $$

with indices symmetrized and with appropriate subtractions. We can use these 
operators to write a final expression for the most singular part of the OPE 
of two currents $J^\mu$. The leading terms in this operator product come from 
(18.130) and the corresponding contraction with $q \to -q$. Extracting the 
pieces of these expressions that contain the highest spin operators (18.132), 
we find 

$$ i\int d^4x\, e^{iq\cdot x}\,J^\mu(x)J^\nu(0) = \sum_f Q_f^2\left[\,4\sum_{n=2}^{\infty}\frac{(2q^{\mu_1})\cdots(2q^{\mu_{n-2}})}{(Q^2)^{n-1}}\;\mathcal{O}_f^{(n)\mu\nu}{}_{\mu_1\cdots\mu_{n-2}} - g^{\mu\nu}\sum_{n=2}^{\infty}\frac{(2q^{\mu_1})\cdots(2q^{\mu_n})}{(Q^2)^n}\;\mathcal{O}^{(n)}_{f\,\mu_1\cdots\mu_n}\right] + \cdots, \tag{18.133} $$

where the sums over $n$ run over even integers only. 

Expression (18.133) has been derived in the leading order in $\alpha_s$. 
Higher-order Feynman diagrams will contribute corrections to the coefficient functions 
of order $\alpha_s(Q^2)$. These corrections will be important only if they are multiplied 
by large logarithms. If we consider the operators appearing on the 
right-hand side to be normalized at the renormalization scale $Q$, there is no large 
ratio of momenta available to enhance the QCD corrections to the coefficient 
functions. Large logarithmic corrections may still arise at a later stage of the 
calculation, when we compute the matrix elements of the operators $\mathcal{O}_f^{(n)}$. 

From the expansion (18.133), it is straightforward to compute an expansion 
for $W^{\mu\nu}$ by taking its expectation value in the proton state. To carry out 
this computation, we need to know the proton matrix elements of the operators 
$\mathcal{O}_f^{(n)}$. Notice that these matrix elements cannot depend on the direction 
of the momentum $q^\mu$, since that dependence has been isolated in the coefficient 
functions. This means that only the proton momentum $P^\mu$ is available 
to carry the vector indices of the matrix element. We can therefore write the 
spin-averaged matrix element of $\mathcal{O}_f^{(n)}$ as 

$$ \langle P|\,\mathcal{O}_f^{(n)\mu_1\cdots\mu_n}\,|P\rangle = A_f^n \cdot 2P^{\mu_1}\cdots P^{\mu_n} - \text{traces}. \tag{18.134} $$

The coefficients $A_f^n$ are dimensionless. They are not quite pure numbers, 
because they depend on the renormalization scale of the operators, but we will 
treat them as constants in the next few paragraphs. 

For the case $n = 1$, the operators $\mathcal{O}_f^{(1)}$ reduce simply to the quark flavor 
currents $\bar q_f\gamma^\mu q_f$; in this case the operators are normalized independently of any 
scale and the coefficients $A_f^1$ are truly constants. From our general discussion 
of form factors in Section 6.2, we know that the proton matrix element of a 
conserved flavor current at zero momentum transfer is given by 

$$ \langle P|\,\bar q_f\gamma^\mu q_f\,|P\rangle = \bar u(P)\gamma^\mu u(P)\,F_{1f}(0), \tag{18.135} $$

where $F_{1f}(0)$ is equal to the value of the corresponding conserved charge in 
the proton state. For the quark currents, this charge is just the number of 
quarks (minus antiquarks) of flavor $f$ in the state $|P\rangle$, which we will call $N_f$. 
Averaging (18.135) over the proton spin, we find 

$$ \langle P|\,\bar q_f\gamma^\mu q_f\,|P\rangle = 2P^\mu \cdot N_f. \tag{18.136} $$

Thus, for $n = 1$, 

$$ A_f^1 = N_f. \tag{18.137} $$

Similarly, $\mathcal{O}_f^{(2)}$ is the contribution of the quark flavor $f$ to the 
energy-momentum tensor of QCD: 

$$ (T^{\mu\nu})_f = \bar q_f\,\gamma^{\{\mu}(iD^{\nu\}})\,q_f. \tag{18.138} $$

Thus, $A_f^2$ is the fraction of the total energy-momentum of the proton that is 
carried by the quark flavor $f$. 

When we evaluate the series for $W^{\mu\nu}$ using (18.133) and the expression 
(18.134) for the operator matrix elements, we find 

$$ W^{\mu\nu} = \sum_f Q_f^2\left[\,8\sum_n P^\mu P^\nu\,\frac{(2q\cdot P)^{n-2}}{(Q^2)^{n-1}}\,A_f^n - 2g^{\mu\nu}\sum_n\frac{(2q\cdot P)^n}{(Q^2)^n}\,A_f^n\right] + \cdots, \tag{18.139} $$

where the sums over $n$ run over even integers from 2 to infinity. In addition 
to the corrections to the OPE omitted in (18.133), we have also dropped 
contributions from the trace terms in (18.134). This is quite appropriate: In 
each of these terms, two factors of the proton momentum $P^\alpha P^\beta$ are replaced 
by $g^{\alpha\beta}m_p^2$, where $m_p^2 = P^2$ is the square of the proton mass. When the indices 
are contracted with powers of $q$, we obtain a term of order 

$$ m_p^2\,Q^2 \ll (2q\cdot P)^2. \tag{18.140} $$

Since $(Q^2/2P\cdot q) = x$, which is held fixed in deep inelastic scattering as $Q^2$ 
becomes large, the contribution from the trace terms is suppressed by a factor 
$m_p^2/Q^2$, times powers of $x$. 

In general, an operator of dimension $d$ has a coefficient function in the 
operator product expansion of currents that has dimension $(\text{mass})^{6-d}$; in the 
Fourier transform of the OPE, this coefficient function will carry a suppression 
factor 

$$ \left(\frac{1}{Q}\right)^{d-2}. \tag{18.141} $$

However, if the operator has spin $s$, the operator matrix element will contribute 
$s$ factors of the vector $P^\mu$, so that, in the kinematic region of deep 
inelastic scattering, the contribution will be of order 

$$ \left(\frac{2P\cdot q}{Q^2}\right)^{s}\left(\frac{1}{Q}\right)^{d-s-2}. \tag{18.142} $$

Thus, the relative size of contributions from the OPE to deep inelastic scattering 
is controlled, not exactly by the dimension of the operator, but rather 
by the twist, defined as 

$$ t = d - s. \tag{18.143} $$

In our selection of the leading terms in the operator product expansion of 
currents, we have consistently kept the contribution of leading spin for each 
dimension or for each power of $Q^{-1}$ in the coefficient. The operators $\mathcal{O}_f^{(n)}$ all 
have twist $t = 2$, which is the smallest possible value for QCD operators other 
than the operator 1. 
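The twist counting of (18.141) to (18.143) amounts to simple bookkeeping, sketched below. The dimension assignments are the standard ones in four dimensions (3/2 per quark field, 1 per derivative). The four-quark case assumes an operator without derivatives in which each Dirac structure carries at most one vector index, so its spin is at most 2. Function names are illustrative.

```python
def twist(dimension, spin):
    """Twist t = d - s, Eq. (18.143)."""
    return dimension - spin

def quark_operator(n):
    """(dimension, spin) of O_f^(n): two quark fields plus n - 1 covariant derivatives."""
    return 3 + (n - 1), n

def four_quark_operator(spin):
    """(dimension, spin) of a derivative-free four-quark operator; spin is at most 2."""
    if not 0 <= spin <= 2:
        raise ValueError("assumed spin range for this operator is 0..2")
    return 6, spin

def suppression_power(dimension, spin):
    """Power p in (1/Q)^p at fixed x, from Eq. (18.142): p = d - s - 2 = t - 2."""
    return dimension - spin - 2

if __name__ == "__main__":
    for n in (1, 2, 4, 6):
        d, s = quark_operator(n)
        print(n, twist(d, s), suppression_power(d, s))
    print(twist(*four_quark_operator(2)), suppression_power(*four_quark_operator(2)))
```

Every $\mathcal{O}_f^{(n)}$ comes out with twist 2 and no power suppression, while the four-quark operator has twist at least 4 and is suppressed by at least $(1/Q)^2$.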

In the operator product of two different flavor currents (for example, 
$\bar u\gamma^\mu u$ and $\bar d\gamma^\nu d$) the leading terms in the OPE have the quark structure 
$(\bar u\Gamma u\,\bar d\Gamma d)$ and thus have twist $t \geq 4$. Thus, to all orders in $\alpha_s$, the cross terms 
in the operator product of currents are suppressed by at least a factor 
$(1/Q^2)$ relative to the leading-twist terms presented in (18.133). If we neglect 
these suppressed terms, the expression for $W^{\mu\nu}$ separates, to all orders, into 
a sum of contributions 

$$ W^{\mu\nu} = \sum_f Q_f^2\,W_f^{\mu\nu}, \tag{18.144} $$

where $W_f^{\mu\nu}$ is the matrix element of two quark flavor currents $\bar q_f\gamma^\mu q_f$. 

We can read from (18.139) the following expressions for $W_1$ and $W_2$: 

$$ W_1 = \sum_f Q_f^2\sum_n 2\,\frac{(2q\cdot P)^n}{(Q^2)^n}\,A_f^n, \qquad W_2 = \sum_f Q_f^2\sum_n \frac{8}{Q^2}\,\frac{(2q\cdot P)^{n-2}}{(Q^2)^{n-2}}\,A_f^n, \tag{18.145} $$

where the sum over $n$ in each line runs over even integers from 2 to infinity. 
Like (18.139), these expressions explicitly separate according to (18.144). It 
is noteworthy that the series (18.145) satisfy the Callan-Gross relation in the 
form (18.121), without further parton model input. However, this relation 
is corrected in order $\alpha_s$ due to the next-order contributions to the operator 
product coefficients. 
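That the series (18.145) obey the Callan-Gross relation term by term can be seen numerically. In the sketch below, $\nu = 2q\cdot P$ plays the role of $ys$ and $x = Q^2/\nu$, so that $ys/4x = \nu^2/4Q^2$. A single flavor is used with $Q_f^2$ factored out, and the moment values are arbitrary placeholders chosen only to make the sums converge for $\nu < Q^2$.

```python
def w1_series(nu, Q2, moments):
    """W_1 of Eq. (18.145) for one flavor; moments maps even n to A^n."""
    return sum(2.0 * (nu / Q2) ** n * A for n, A in moments.items())

def w2_series(nu, Q2, moments):
    """W_2 of Eq. (18.145) for one flavor."""
    return sum(8.0 / Q2 * (nu / Q2) ** (n - 2) * A for n, A in moments.items())

def callan_gross_rhs(nu, Q2, moments):
    """(y s / 4x) W_2 with y s = nu and x = Q^2 / nu."""
    return nu ** 2 / (4.0 * Q2) * w2_series(nu, Q2, moments)

if __name__ == "__main__":
    # Placeholder moments for even n = 2, 4, ..., 40 (hypothetical values).
    toy_moments = {n: 1.0 / (n + 1) for n in range(2, 42, 2)}
    print(w1_series(3.0, 10.0, toy_moments), callan_gross_rhs(3.0, 10.0, toy_moments))
```

The agreement does not depend on the values of the moments, since each power of $\nu/Q^2$ matches separately.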

Because the leading contributions to the deep inelastic form factors can 
be written as sums over quark flavors, it is tempting to reverse the logic of Eq. 
(18.120) and use these equations to define the parton distribution functions. 
In particular, let us define 

$$ x f_f^+(x, Q^2) = \frac{ys}{4\pi}\,\mathrm{Im}\,W_{2f}(x, Q^2), \tag{18.146} $$

where $W_{2f}$ is the second form factor of $W_f^{\mu\nu}$, defined in (18.144), neglecting 
terms suppressed by powers of $Q^2$. In the parton model evaluation, 

$$ f_f^+(x) = f_f(x) + f_{\bar f}(x). \tag{18.147} $$

From (18.123) and the definition (18.146), we know that $f_f^+(x)$ enters in the 
correct way into the formula for the deep inelastic scattering cross section. 
However, parton distribution functions have other important properties, 
including the normalization conditions (17.36) and (17.39) and the evolution 
with $Q^2$ discussed in Section 17.6. We must now see whether we can derive 
these properties from (18.146) using the operator product expansion. 


The Dispersion Integral 

The operator product analysis has given us explicit expressions for $W_1$ and 
$W_2$ as a series in inverse powers of $Q^2$. In the following discussion, we will 
concentrate on the analysis of $W_2$. We must work out the relation of its series 
expansion to the observable deep inelastic scattering cross section. As 
in the discussion of Section 18.4, the OPE analysis naturally takes place in 
an unphysical kinematic region. To make the operator product expansion, we 
needed to consider $Q^2$ to be larger than any other kinematic invariant. However, 
in the physical region for deep inelastic scattering, $2P\cdot q > Q^2$. We need 
a formula that connects these two distinct regions. 

Figure 18.11. Analytic singularities of $W_2(\nu, Q^2)$ in the complex $\nu$ plane, 
for fixed $Q^2$. 

To state this problem more precisely, define 

$$ \nu = 2P\cdot q = ys; \tag{18.148} $$

in the frame in which the proton is at rest, $\nu = 2m_p q^0$. The form factor 
$W_2$ can be viewed as a function of $\nu$ and $Q^2$. Then, for fixed $Q^2$, the OPE 
gives a series expansion about the point $\nu = 0$, while the physical region for 
deep inelastic scattering is $\nu > Q^2$. Because this region is associated with 
a physical scattering process, $W_2(\nu, Q^2)$, viewed as an analytic function of 
$\nu$ for fixed $Q^2$, will have a branch cut along the real $\nu$ axis in this region. 
The discontinuity across this branch cut will be $(2i)$ times the imaginary part 
of $W_2$, which appears in the expression (18.123) for the deep inelastic cross 
section. Because expression (18.102) is symmetric under the interchange of 
$(q, \mu)$ and $(-q, \nu)$, $W_2$ must obey 

$$ W_2(-\nu, Q^2) = W_2(\nu, Q^2). \tag{18.149} $$

Thus, $W_2$ must also have a branch cut along the negative real axis, from 
$\nu = -Q^2$ to $-\infty$. The discontinuity across this cut gives the cross section for 
the $u$-channel process in which positive energy comes in through the second 
current and out through the first. Since $q^2 = -Q^2 < 0$, there is no possible 
physical $t$-channel process; thus $W_2$ has no further singularities in the complex 
$\nu$ plane. The analytic structure of $W_2(\nu, Q^2)$ is shown in Fig. 18.11. 

Now consider the contour integral 

$$ I_n = \int\frac{d\nu}{2\pi i}\,\frac{1}{\nu^{n-1}}\,W_2(\nu, Q^2), \tag{18.150} $$

for $n$ even, taken on a small circle surrounding the origin. This integral picks 
out the coefficient of $\nu^{n-2}$ in the series expansion for $W_2$. The OPE formula 
(18.145) gives us the leading contribution to this coefficient for large $Q^2$: 

$$ I_n = \sum_f Q_f^2\,\frac{8}{(Q^2)^{n-1}}\,A_f^n. \tag{18.151} $$

Figure 18.12. Contour of integration involved in the derivation of the moment 
sum rules for $W_2$. 

The corrections to this formula are of order $\alpha_s(Q^2)$, from the evaluation of 
the OPE coefficient functions. 

On the other hand, we can also distort the contour as shown in Fig. 18.12 
and evaluate it as an integral over the discontinuities of $W_2$. By the symmetry 
(18.149), the two branch cuts give equal contributions. Thus, 

$$ I_n = 2\int_{Q^2}^{\infty}\frac{d\nu}{2\pi i}\,\frac{1}{\nu^{n-1}}\,(2i)\,\mathrm{Im}\,W_2(\nu, Q^2). \tag{18.152} $$

Now change variables to $x = Q^2/\nu$. The integral becomes 

$$ I_n = \frac{8}{(Q^2)^{n-1}}\int_0^1 dx\, x^{n-2}\,\frac{ys}{4\pi}\,\mathrm{Im}\,W_2. \tag{18.153} $$

When we equate (18.151) and (18.153) and relate $\mathrm{Im}\,W_2$ to the parton 
distributions $f_f^+(x)$ using (18.146), the relation we have derived splits into a series 
of sum rules, 

$$ \int_0^1 dx\, x^{n-1} f_f^+(x, Q^2) = A_f^n, \tag{18.154} $$

for $n$ even. These relations are known as the moment sum rules for the deep 
inelastic form factors. They relate the $x$ moments of the parton distribution 
functions, as defined by Eq. (18.146), to the proton matrix elements of twist-2 
operators. 
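To make the moment sum rules concrete, the sketch below computes the $x$ moments of a toy distribution numerically and compares them with the closed form given by the Euler Beta function. The shape $x^a(1-x)^b$ with $a = 0.5$, $b = 3$ is an arbitrary assumption for illustration, as are the function names. The last helper evaluates the OPE coefficient (18.151) for a single flavor once $A^n$ is identified with the moment.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def toy_pdf(x, norm=1.0, a=0.5, b=3.0):
    """Illustrative shape for f^+(x); not a fit to data."""
    return norm * x ** a * (1.0 - x) ** b

def moment_numeric(n, pdf=toy_pdf):
    """Left-hand side of Eq. (18.154): integral of x^(n-1) f^+(x) over [0, 1]."""
    return simpson(lambda x: x ** (n - 1) * pdf(x), 0.0, 1.0)

def moment_exact(n, norm=1.0, a=0.5, b=3.0):
    """Same moment in closed form: norm * B(n + a, b + 1)."""
    return norm * math.gamma(n + a) * math.gamma(b + 1.0) / math.gamma(n + a + b + 1.0)

def contour_coefficient(n, Q2, charge_sq, A_n):
    """I_n of Eq. (18.151) for a single flavor."""
    return charge_sq * 8.0 / Q2 ** (n - 1) * A_n

if __name__ == "__main__":
    for n in (2, 4, 6):
        print(n, moment_numeric(n), moment_exact(n))
```

Higher moments weight the large-$x$ region more strongly, so they fall quickly with $n$ for any distribution that vanishes as $x \to 1$.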

Because $W_2$ is a symmetric function of $\nu$, the moment sum rules apply 
only for even $n$. However, in deep inelastic neutrino scattering, there is a third 
form factor in $W^{\mu\nu}$, associated with the interference term between the vector 
and axial vector parts of the weak interaction current. In Problem 18.2, we 
show that this form factor can be used to derive a set of sum rules for odd $n$: 

$$ \int_0^1 dx\, x^{n-1} f_f^-(x, Q^2) = A_f^n, \tag{18.155} $$

where $A_f^n$ is the coefficient of the proton matrix element (18.134) for odd $n$, 
and $f_f^-(x)$ is a form factor which, in the parton model, evaluates to 

$$ f_f^-(x) = f_f(x) - f_{\bar f}(x). \tag{18.156} $$

Combining this information with the argument given below (18.136), we 
can see that the definition of the parton distribution functions from the deep 
inelastic form factors has the correct normalization. Using (18.137), we find 

$$ \int_0^1 dx\, f_f^-(x) = N_f, \tag{18.157} $$

the (net) number of quarks of flavor $f$ in the proton. Similarly, (18.154) and 
(18.138) imply 

$$ \int_0^1 dx\, x f_f^+(x) = \langle x\rangle_f, \tag{18.158} $$

where $\langle x\rangle_f$ is the fraction of the total energy-momentum of the proton carried 
by quarks and antiquarks of flavor $f$. 
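As an illustration of the normalization conditions (18.157) and (18.158), consider a toy model with no antiquarks, so that $f_f^+ = f_f^- = c\,x^a(1-x)^b$. The constant $c$ is fixed by the quark number and the momentum fraction then follows from a ratio of Beta functions. The shape parameters below are assumptions made only for this sketch.

```python
import math

def beta(p, q):
    """Euler Beta function B(p, q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def valence_norm(N_f, a, b):
    """Constant c such that c * x^a (1-x)^b integrates to N_f, as in Eq. (18.157)."""
    return N_f / beta(a + 1.0, b + 1.0)

def momentum_fraction(N_f, a, b):
    """<x>_f of Eq. (18.158) for the toy shape: c * B(a + 2, b + 1)."""
    return valence_norm(N_f, a, b) * beta(a + 2.0, b + 1.0)

if __name__ == "__main__":
    # Hypothetical valence shapes: two u quarks and one d quark in the proton.
    print(momentum_fraction(2, -0.5, 3.0))
    print(momentum_fraction(1, -0.5, 4.0))
```

Since $B(a+2, b+1)/B(a+1, b+1) = (a+1)/(a+b+2)$, the momentum fraction in this toy model is $N_f (a+1)/(a+b+2)$. Any momentum not accounted for by the quark flavors must be carried by other constituents.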


Operator Rescaling 


If the coefficients $A_f^n$ were truly constants, relations (18.154) and (18.155) 
would be consistent with parton distribution functions that satisfy Bjorken 
scaling. However, as we remarked below (18.134), these factors actually 
depend on $Q^2$, since this is the normalization point of the operators in the 
operator product expansion (18.133). Since this dependence comes only through 
operator rescaling, it involves only logarithms of $Q^2$, and so contributes only 
a slow violation of Bjorken scaling. We can work out the $Q^2$ dependence of 
the parton distribution functions quantitatively by summing the leading 
logarithmic corrections to the matrix elements of the twist-2 operators. 

To account for these corrections, let us first assume (incorrectly, as we will 
see below) that the twist-2 operators (18.132) are renormalized without 
operator mixing. Then the leading logarithmic corrections to the matrix element 
of the operator $\mathcal{O}_f^{(n)}$ would be summed by rescaling the operator normalized 
at $Q$ to operators normalized at a standard reference point $\mu$, of order 1 GeV. 
The relation between these conventions would be 

$$ \big[\mathcal{O}_f^{(n)}\big]_Q = \left(\frac{\log(Q^2/\Lambda^2)}{\log(\mu^2/\Lambda^2)}\right)^{a_f^n/2b_0}\big[\mathcal{O}_f^{(n)}\big]_\mu, \tag{18.159} $$

where $a_f^n$ is the first coefficient of the $\gamma$ function of $\mathcal{O}_f^{(n)}$. Then the factors $A_f^n$ 
would depend on $Q^2$ according to 

$$ A_f^n(Q^2) = \left(\frac{\log(Q^2/\Lambda^2)}{\log(\mu^2/\Lambda^2)}\right)^{a_f^n/2b_0} A_f^n(\mu^2). \tag{18.160} $$

Figure 18.13. Diagrams contributing to the anomalous dimension of the quark 
twist-2 operators. 

This equation agrees with the scale dependence of operator product coefficients 
written in (18.76), for the special case of an operator product of currents, 
$a_1 = a_2 = 0$. To find the explicit form of the rescaling factor, we must 
compute $a_f^n$. 

To compute the $\gamma$ functions of the quark twist-2 operators, we must compute 
their counterterms for operator rescaling. These are determined by the 
diagrams shown in Fig. 18.13. It suffices to compute these diagrams with 
external momentum $p$ entering through the quark line and zero external momentum 
injected into the operator. Under these conditions, the matrix element of 
the operator $\mathcal{O}_f^{(n)}$, in leading order, equals 

$$ \gamma^{\mu_1}p^{\mu_2}\cdots p^{\mu_n}. \tag{18.161} $$

Here and at all later points in the discussion, we will treat the matrix elements 
of $\mathcal{O}_f^{(n)}$ as though they are symmetrized in the $n$ indices and have all possible 
traces subtracted. We must now evaluate the diagrams of Fig. 18.13 and collect 
all terms that rescale this structure. 

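The first diagram of Fig. 18.13 is quite straightforward to evaluate: 

$$ \int\frac{d^4k}{(2\pi)^4}\,(ig\gamma_\lambda t^a)\,\frac{i\not k}{k^2}\,\gamma^{\mu_1}k^{\mu_2}\cdots k^{\mu_n}\,\frac{i\not k}{k^2}\,(ig\gamma^\lambda t^a)\,\frac{-i}{(k-p)^2} = -ig^2 C_2(r)\int\frac{d^4k}{(2\pi)^4}\,\frac{\gamma_\lambda\not k\,\gamma^{\mu_1}\not k\,\gamma^\lambda\,k^{\mu_2}\cdots k^{\mu_n}}{(k^2)^2(k-p)^2}. \tag{18.162} $$

We combine denominators using identity (6.40): 

$$ \frac{1}{(k^2)^2(k-p)^2} = \int_0^1 dx\,\frac{2(1-x)}{(\ell^2 - \Delta)^3}; \tag{18.163} $$

the quantities in the denominator on the right are $\ell = k - xp$ and $\Delta = 
-x(1-x)p^2$. We must now shift the integral, substitute $k = \ell + xp$ in the 
numerator, and pick out a term proportional to $(n-1)$ powers of $p$. If this 
term contains the factor $g^{\mu_i\mu_j}$, we may drop it, since it contributes to the 
coefficient of an operator of higher twist and since, in any event, it will be 
removed when we subtract traces. Thus, we must choose carefully which two 
factors of $k$ we replace with $\ell$ when we replace the others with $(xp)$. The 
following choices, simplified using the rotational symmetry of the $\ell$ integral, 
do not give useful contributions: 

$$ \ell^{\mu_i}\ell^{\mu_j} \to \frac{1}{d}\,\ell^2 g^{\mu_i\mu_j}, \qquad \not\ell\,\gamma^{\mu_1}\ell^{\mu_j} \to \frac{1}{d}\,\ell^2\gamma^{\mu_j}\gamma^{\mu_1} \to \frac{1}{d}\,\ell^2 g^{\mu_1\mu_j}. \tag{18.164} $$

In the second line, we have used the symmetry under $\mu_1 \leftrightarrow \mu_j$. The one 
remaining placement of the factors of $\ell$ is 

$$ \gamma_\lambda\not\ell\,\gamma^{\mu_1}\not\ell\,\gamma^\lambda \to \frac{(2-d)^2}{d}\,\ell^2\gamma^{\mu_1}. \tag{18.165} $$

Thus (18.162) has the value 

$$ -ig^2\,\frac43\int_0^1 dx\, 2(1-x)\int\frac{d^4\ell}{(2\pi)^4}\,\frac{(2-d)^2}{d}\,\frac{\ell^2}{(\ell^2-\Delta)^3}\;\gamma^{\mu_1}(xp)^{\mu_2}\cdots(xp)^{\mu_n} $$
$$ = -ig^2\,\frac43\int_0^1 dx\, 2(1-x)\,x^{n-1}\,\frac{i}{(4\pi)^2}\,\Gamma(2-\tfrac d2)\;\gamma^{\mu_1}p^{\mu_2}\cdots p^{\mu_n} $$
$$ = \frac43\,\frac{2}{n(n+1)}\,\frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac d2)\;\gamma^{\mu_1}p^{\mu_2}\cdots p^{\mu_n}. \tag{18.166} $$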
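Two elementary ingredients of this evaluation can be checked independently of the loop integral: the Feynman-parameter identity (18.163), here tested for positive real denominators $A$ and $B$ standing in for $k^2$ and $(k-p)^2$, and the parameter integral that produces the factor $2/n(n+1)$ in (18.166). The sketch below is illustrative; the function names are not from the text.

```python
from fractions import Fraction

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def feynman_rhs(A, B):
    """Integral over x of 2(1-x) / ((1-x)A + xB)^3, which should equal 1/(A^2 B)."""
    return simpson(lambda x: 2.0 * (1.0 - x) / ((1.0 - x) * A + x * B) ** 3, 0.0, 1.0)

def x_integral(n):
    """Exact value of the integral of 2(1-x) x^(n-1) over [0, 1]."""
    return 2 * (Fraction(1, n) - Fraction(1, n + 1))

if __name__ == "__main__":
    print(feynman_rhs(2.0, 5.0), 1.0 / (2.0 ** 2 * 5.0))
    print([x_integral(n) for n in range(1, 6)])
```

The exact fractions reproduce $2/n(n+1)$ for every $n$, which is the only $n$ dependence that the first diagram contributes.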


It is not so obvious that there are additional contributions to the rescaling 
of the operators $\mathcal{O}_f^{(n)}$. Note, however, that the covariant derivatives in (18.132) 
contain explicit factors of the gauge field, 

$$ iD^\mu = i\partial^\mu + gA^{a\mu}t^a, \tag{18.167} $$

and these may be contracted with gauge field vertices on the external legs. 
These contributions give rise to the second and third diagrams in Figure 18.13. 
The term in which two factors of $A^{a\mu}$ from (18.167) are contracted with one 
another is proportional to $g^{\mu_i\mu_j}$ and thus does not contribute to the rescaling 
of the leading-twist operators. 

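The contributions we have just described have the form of sums over $j$, 
where $\mu_j$ is the index of the derivative that includes the contraction. Then 
the second diagram of Fig. 18.13 is the sum over $j$ of the following integral: 

$$ \int\frac{d^4k}{(2\pi)^4}\,(ig\gamma_\lambda t^a)\,\frac{i\not k}{k^2}\,\gamma^{\mu_1}k^{\mu_2}\cdots k^{\mu_{j-1}}\,(g t^a g^{\lambda\mu_j})\,p^{\mu_{j+1}}\cdots p^{\mu_n}\,\frac{-i}{(k-p)^2} $$
$$ = ig^2 C_2(r)\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2(k-p)^2}\;\gamma^{\mu_j}\not k\,\gamma^{\mu_1}k^{\mu_2}\cdots k^{\mu_{j-1}}p^{\mu_{j+1}}\cdots p^{\mu_n}. \tag{18.168} $$

Since $\mu_j$ and $\mu_1$ are symmetrized, we can use (18.128) to rewrite 

$$ \gamma^{\mu_j}\not k\,\gamma^{\mu_1} \to k^{\mu_j}\gamma^{\mu_1} + \gamma^{\mu_j}k^{\mu_1} \to 2\gamma^{\mu_1}k^{\mu_j}, \tag{18.169} $$

where, in the second line, the symmetrization of indices and subtraction of 
traces is understood. Now combine denominators. To obtain a term with 
$(n-1)$ factors of $p$, we must replace every factor $k$ in the numerator of (18.168) 
with $(xp)$. This gives 

$$ ig^2\,\frac43\int_0^1 dx\int\frac{d^4\ell}{(2\pi)^4}\,\frac{1}{(\ell^2-\Delta)^2}\;2\gamma^{\mu_1}(xp^{\mu_2})\cdots(xp^{\mu_j})\,p^{\mu_{j+1}}\cdots p^{\mu_n} $$
$$ = ig^2\,\frac43\int_0^1 dx\, x^{j-1}\,\frac{2i}{(4\pi)^2}\,\Gamma(2-\tfrac d2)\;\gamma^{\mu_1}p^{\mu_2}\cdots p^{\mu_n} $$
$$ = -\frac43\,\frac{2}{j}\,\frac{g^2}{(4\pi)^2}\,\Gamma(2-\tfrac d2)\;\gamma^{\mu_1}p^{\mu_2}\cdots p^{\mu_n}. \tag{18.170} $$

This contribution must be summed over $j$ from 2 to $n$. The third diagram of 
Fig. 18.13 makes an equal contribution. 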

Summing the rescaling factors from the three diagrams of Fig. 18.13, we 
find for the operator rescaling counterterm of $\mathcal{O}_f^{(n)}$ 

$$ \delta_f = \frac{g^2}{(4\pi)^2}\,\frac43\left[\,4\sum_{j=2}^{n}\frac1j - \frac{2}{n(n+1)}\right]\frac{\Gamma(2-\frac d2)}{(M^2)^{2-d/2}}. \tag{18.171} $$

From this result, we can derive the Callan-Symanzik $\gamma$ function by the use of 
(18.23) and the field strength renormalization counterterm (18.9). We find 

$$ \gamma_f^n = \frac83\,\frac{g^2}{(4\pi)^2}\left[\,1 + 4\sum_{j=2}^{n}\frac1j - \frac{2}{n(n+1)}\right]. \tag{18.172} $$

Notice that this expression vanishes for $n = 1$, so that there is no rescaling of 
$A_f^1$, as required by (18.157). For $n > 1$, $\gamma_f^n$ is positive and so its coefficient $a_f^n$ 
is negative. This implies that the higher moments of the quark distribution 
functions are suppressed as $Q^2$ becomes large. 
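The size of this effect can be estimated with a short calculation. The sketch below assumes the convention $\gamma_f^n = -a_f^n\,g^2/(4\pi)^2$ for the first coefficient, the one-loop value $b_0 = 11 - \tfrac23 n_f$ for QCD with $n_f$ flavors, and the unmixed rescaling law (18.160), which this section notes is not the full story. The scale values in the example are arbitrary and the function names are illustrative.

```python
import math
from fractions import Fraction

def bracket(n):
    """1 + 4 sum_{j=2}^n 1/j - 2/(n(n+1)), the n dependence of Eq. (18.172)."""
    return 1 + 4 * sum(Fraction(1, j) for j in range(2, n + 1)) - Fraction(2, n * (n + 1))

def a_f(n):
    """Coefficient a_f^n, assuming gamma_f^n = -a_f^n g^2 / (4 pi)^2."""
    return -Fraction(8, 3) * bracket(n)

def b0(n_flavors):
    """One-loop beta function coefficient of QCD, b_0 = 11 - (2/3) n_f."""
    return Fraction(11) - Fraction(2, 3) * n_flavors

def moment_rescaling(n, Q2, mu2, Lambda2, n_flavors=4):
    """A_f^n(Q^2) / A_f^n(mu^2) from Eq. (18.160), ignoring operator mixing."""
    exponent = float(a_f(n) / (2 * b0(n_flavors)))
    return (math.log(Q2 / Lambda2) / math.log(mu2 / Lambda2)) ** exponent

if __name__ == "__main__":
    # Illustrative scales in GeV^2: Q^2 = 100, mu^2 = 1, Lambda^2 = 0.04.
    for n in (1, 2, 4, 8):
        print(n, a_f(n), moment_rescaling(n, 100.0, 1.0, 0.04))
```

The $n = 1$ moment is unchanged, and the suppression grows with $n$, roughly like $\log n$ in the exponent, so the distributions shift toward smaller $x$ as $Q^2$ increases.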


Operator Mixing 

The QCD rescaling of the operators $\mathcal{O}_f^{(n)}$ is still more complicated because 
QCD contains additional twist-2 operators which can be built from gluon 
fields. These new operators are mixed with the quark twist-2 operators by the 
diagrams of Fig. 18.14. 

Figure 18.14. Diagrams that produce operator mixing between twist-2 
quark and gluon operators. 

For $n$ even, the diagrams of Fig. 18.14 give the operators $\mathcal{O}_f^{(n)}$ matrix 
elements in the state of a gluon with momentum $p$. The tensor structure of 
this matrix element contains the term 




(18.173) 


where $\alpha$, $\beta$ are the polarization indices of the external gluons. This structure 
arises from the operator 

cT/’ = - traces, (18.174) 

y 2 

symmetrized on $\mu_1, \ldots, \mu_n$, with traces subtracted. These operators have 
dimension $(n + 2)$ and spin $n$, and thus have twist 2. 

The gluon operators (18.174) are relevant only for $n$ even. Using the 
manipulation 

$$ F^{a\mu_1\nu}(iD^{\mu_2})\cdots F^{a\mu_n}{}_{\nu} = i\partial^{\mu_2}\big(F^{a\mu_1\nu}\cdots F^{a\mu_n}{}_{\nu}\big) - (iD^{\mu_2})F^{a\mu_1\nu}\cdots F^{a\mu_n}{}_{\nu}, \tag{18.175} $$

we can transfer the covariant derivatives from one factor of $F^a_{\mu\nu}$ to the other, 
giving 

$$ \mathcal{O}_g^{(n)} = (-1)^n\,\mathcal{O}_g^{(n)} + \partial(\mathcal{O}'). \tag{18.176} $$

Thus, for $n$ odd, the operator $\mathcal{O}_g^{(n)}$ is equal to a total derivative. The matrix 
elements of a total derivative are proportional to the momentum injected into 
this operator. Since zero momentum is injected in the calculation of the proton 
matrix elements of the OPE of currents, the operators $\mathcal{O}_g^{(n)}$ have no effect on 
the deep inelastic scattering cross section for $n$ odd. 

For n even, however, we must take account of the mixing of $O_f^{(n)}$ with
$O_g^{(n)}$. The computation of the diagrams of Fig. 18.14 is quite similar to the
other operator rescaling calculations we have done in this chapter, and so we
reserve working out the details for Problem 18.3. We find that the diagrams
of Fig. 18.14 contain a structure proportional to (18.173) with the coefficient

$$\frac{2(n^2+n+2)}{n(n+1)(n+2)}\,\frac{g^2}{(4\pi)^2}\,\Gamma\bigl(2-\tfrac{d}{2}\bigr). \tag{18.177}$$


From this computation, we find that the renormalized twist-2 quark operator,
properly normalized at the scale M, is given in terms of bare operators by

$$[O_f^{(n)}]_M = (1+\delta_f)\,[O_f^{(n)}]_0 + (\delta_g)\,[O_g^{(n)}]_0, \tag{18.178}$$

where $\delta_f$ is given by (18.171) and

$$\delta_g = -\frac{g^2}{(4\pi)^2}\,\frac{2(n^2+n+2)}{n(n+1)(n+2)}\,\frac{\Gamma(2-\frac{d}{2})}{(M^2)^{2-d/2}}. \tag{18.179}$$

Figure 18.15. Diagrams contributing to the operator rescaling of twist-2
gluon operators: (a) contributions to gluon-quark mixing; (b) contributions
to diagonal gluon operator renormalization.


This equation gives us two elements of the anomalous dimension matrix of
twist-2 operators.

The remaining elements of the $\gamma$ matrix for twist-2 operators are
generated by the diagrams shown in Fig. 18.15. The diagram of Fig. 18.15(a) gives
the mixing of $O_g^{(n)}$ back into $O_f^{(n)}$. The diagrams of Fig. 18.15(b), combined
with the counterterm $\delta_3$ for gluon field strength rescaling, give the diagonal
anomalous dimension. The counterterm $\delta_3$ is given explicitly, in Feynman-
't Hooft gauge, in (16.74). The remainder of this anomalous dimension
computation is discussed in Problem 18.3.

To describe the complete anomalous dimension matrix, we begin by
considering a strong interaction model with one quark flavor. In this case, there
is one twist-2 operator $O_f^{(n)}$ which mixes with $O_g^{(n)}$. These two operators
mix through a 2 x 2 matrix

$$\gamma^n = -\frac{g^2}{(4\pi)^2}\begin{pmatrix} a_{ff}^n & a_{fg}^n \\ a_{gf}^n & a_{gg}^n \end{pmatrix}, \tag{18.180}$$

where

$$\begin{aligned}
a_{ff}^n &= -\frac{8}{3}\Bigl[1 + 4\sum_{j=2}^{n}\frac{1}{j} - \frac{2}{n(n+1)}\Bigr], \\
a_{fg}^n &= 4\,\frac{n^2+n+2}{n(n+1)(n+2)}, \\
a_{gf}^n &= \frac{16}{3}\,\frac{n^2+n+2}{n(n^2-1)}, \\
a_{gg}^n &= -6\Bigl[\frac{1}{3} + \frac{2}{9}n_f + 4\sum_{j=2}^{n}\frac{1}{j} - \frac{4}{n(n-1)} - \frac{4}{(n+1)(n+2)}\Bigr].
\end{aligned} \tag{18.181}$$

Notice that this matrix is not symmetric. In the last line, $n_f$ is the number
of quark flavors, equal to 1 in this case; this term comes from (16.74).

In the realistic case, QCD contains several quark flavors: u, d, s, and also
c and b when we work at momenta sufficiently large that we can ignore the
masses of these particles. Then the anomalous dimension matrix $\gamma^n$ has size
$(n_f + 1) \times (n_f + 1)$. The submatrix acting on quark operators is diagonal,
with all of the diagonal entries being given by $a_{ff}^n$ in (18.181). The quark-
gluon and gluon-quark entries are all given by $a_{fg}^n$ and $a_{gf}^n$, respectively, and
are independent of the flavor. The gluon diagonal entry is given by $a_{gg}^n$ in
(18.181) with the realistic value of $n_f$. This means that the gluon operator
mixes with only one linear combination of quark operators:

$$\sum_f O_f^{(n)}; \tag{18.182}$$

the orthogonal linear combinations are simply rescaled, with the exponent
given by $a_{ff}^n$ or (18.172).

Let us now apply this analysis of operator mixing to the evaluation of the
moment sum rules. For odd n, there is no operator mixing, and so the $Q^2$
dependence of the right-hand side of (18.155) is correctly given by the simple
rescaling (18.160).

For even n, we must take operator mixing into account. The right-hand
side of the sum rule (18.154) is the proton matrix element of a twist-2 operator
normalized at the scale Q. Let us write an arbitrary linear combination of these
operators as

$$c^i\,[O_i^{(n)}]_Q, \tag{18.183}$$

where the index i runs over g and the various flavors f. To rescale this operator
to a fixed reference momentum $\mu$, we rewrite the coefficients $c^i$ in a basis of left
eigenvectors of $\gamma^n$ and rescale each eigenvector according to (18.159). In
terms of the matrix $a^n_{ij}$ of rescaling coefficients, we can write the rescaling
abstractly as

$$c^i\left[\left(\frac{\log(Q^2/\Lambda^2)}{\log(\mu^2/\Lambda^2)}\right)^{a^n/2b_0}\right]_{ij}[O_j^{(n)}]_\mu. \tag{18.184}$$

This rescaling, acting with $c^i$ to the left of the matrix $(a^n)_{ij}$, is precisely the
prescription required by Eq. (18.79).

Let us work this out explicitly for the case n = 2. The right-hand side
of the moment sum rule (18.154) is given by the matrix element of $O_f^{(2)}$. We
rewrite this as

$$O_f^{(2)} = \Bigl(O_f^{(2)} - \frac{1}{n_f}\sum_{f'}O_{f'}^{(2)}\Bigr) + \frac{1}{n_f}\sum_{f'}O_{f'}^{(2)}. \tag{18.185}$$

The first term is simply rescaled; the second term mixes with the gluon
operator $O_g^{(2)}$. The anomalous dimension matrix acting on $(\sum_f O_f, O_g)$ for n = 2
has coefficients

$$\begin{pmatrix} a_{ff}^2 & a_{fg}^2\,n_f \\ a_{gf}^2 & a_{gg}^2 \end{pmatrix} = \begin{pmatrix} -\frac{64}{9} & \frac{4}{3}n_f \\ \frac{64}{9} & -\frac{4}{3}n_f \end{pmatrix}. \tag{18.186}$$

The left eigenvectors of this matrix, and their corresponding eigenvalues, are

$$(1,\,1) \;\to\; a^2 = 0, \qquad \bigl(\tfrac{16}{3},\,-n_f\bigr) \;\to\; a^2 = -\tfrac{4}{3}\bigl(\tfrac{16}{3}+n_f\bigr). \tag{18.187}$$

Notice that the first eigenvector gives a linear combination of operators $c^i O_i^{(2)}$
with zero anomalous dimension. This operator is in fact the total energy-
momentum tensor of QCD,

$$T^{\mu\nu} = \sum_f O_f^{(2)\mu\nu} + O_g^{(2)\mu\nu}, \tag{18.188}$$

which must have $\gamma = 0$. If we expand the second term in (18.185) in terms
of the components (18.187), we can compute the full form of the operator
rescaling. We find

$$\begin{aligned}
[O_f^{(2)}]_Q ={}& \frac{1}{16/3+n_f}\,T^{\mu\nu} \\
&+ \frac{1}{n_f(\frac{16}{3}+n_f)}\left(\frac{\log(Q^2/\Lambda^2)}{\log(\mu^2/\Lambda^2)}\right)^{-\frac{4}{3}(\frac{16}{3}+n_f)/2b_0}\Bigl[\tfrac{16}{3}\sum_{f'}O_{f'}^{(2)} - n_f\,O_g^{(2)}\Bigr]_\mu \\
&+ \left(\frac{\log(Q^2/\Lambda^2)}{\log(\mu^2/\Lambda^2)}\right)^{-32/9b_0}\Bigl[O_f^{(2)} - \frac{1}{n_f}\sum_{f'}O_{f'}^{(2)}\Bigr]_\mu,
\end{aligned} \tag{18.189}$$

where T is the energy-momentum tensor (18.188). The right-hand side of
the n = 2 moment sum rule is given by the coefficient of the proton matrix
element of this operator. To evaluate this coefficient, we need to define gluon
analogues of the $A_f^n$, by writing, analogously to (18.134),


$$\langle P|\,O_g^{(n)\mu_1\cdots\mu_n}\,|P\rangle = A_g^n \cdot 2P^{\mu_1}\cdots P^{\mu_n} - \text{traces}. \tag{18.190}$$

For the case n = 2, we note in particular that

$$\langle P|\,T^{\mu\nu}\,|P\rangle = 2P^\mu P^\nu; \tag{18.191}$$

thus, (18.188) implies

$$\sum_f A_f^2 + A_g^2 = 1. \tag{18.192}$$

If we replace each operator in (18.189) by the corresponding coefficient $A_i^2(\mu)$,
we will have an expression for the right-hand side of the n = 2 moment sum
rule which makes its $Q^2$ dependence explicit.

Although expression (18.189) is rather complicated, it has a simple form in
the extreme limit $Q^2 \to \infty$. At asymptotic $Q^2$, the last two terms of (18.189)
tend to zero, and the right-hand side of (18.189) becomes a fixed number
times the energy-momentum tensor. Then, using (18.191), we can evaluate
the n = 2 moment sum rule completely:

$$\int_0^1 dx\,x\,\bigl[f_f(x) + f_{\bar f}(x)\bigr] \;\to\; \frac{1}{16/3+n_f}. \tag{18.193}$$

Figure 18.16. Fractions of the total energy-momentum of the proton carried
by various parton species, as a function of Q, according to the CTEQ fit to
deep inelastic scattering data described in Fig. 17.6. The Q dependence of
the curves is calculated from the QCD evolution equations.

In this extreme limit, we find that each quark flavor carries the same fixed
fraction of the energy-momentum of the proton. By (18.192), the remainder
is carried by the gluons. To illustrate, in a theory with $n_f = 4$, each quark
flavor carries 3/28 of the total momentum of the proton, and the gluons carry
the remaining 4/7. Figure 18.16 shows how slowly these asymptotic results
are approached starting from realistic parton distributions.
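
The n = 2 numbers quoted above can be checked with exact rational arithmetic. The sketch below is not part of the original derivation: it assumes the standard leading-order coefficients of (18.181), typed in by hand, and the helper names (`a_matrix`, `left_apply`) are purely illustrative. It builds the matrix acting on $(\sum_f O_f, O_g)$, confirms the two left eigenvectors, and evaluates the asymptotic momentum fractions for $n_f = 4$.

```python
from fractions import Fraction as F

def harmonic_tail(n):
    """Exact value of sum_{j=2}^{n} 1/j."""
    return sum((F(1, j) for j in range(2, n + 1)), F(0))

def a_matrix(n, nf):
    """Matrix acting on (sum_f O_f, O_g); coefficients assumed from (18.181), n >= 2."""
    s = harmonic_tail(n)
    a_ff = F(-8, 3) * (1 + 4 * s - F(2, n * (n + 1)))
    a_fg = 4 * F(n * n + n + 2, n * (n + 1) * (n + 2))
    a_gf = F(16, 3) * F(n * n + n + 2, n * (n * n - 1))
    a_gg = -6 * (F(1, 3) + F(2, 9) * nf + 4 * s
                 - F(4, n * (n - 1)) - F(4, (n + 1) * (n + 2)))
    # Summing over flavors multiplies the quark -> gluon entry by nf.
    return [[a_ff, nf * a_fg], [a_gf, a_gg]]

def left_apply(v, m):
    """Row vector v times 2x2 matrix m."""
    return [v[0] * m[0][0] + v[1] * m[1][0],
            v[0] * m[0][1] + v[1] * m[1][1]]

nf = 4
m2 = a_matrix(2, nf)

# Candidate left eigenvectors and eigenvalues for n = 2.
v_tensor = [F(1), F(1)]              # energy-momentum tensor direction
v_other = [F(16, 3), F(-nf)]
lam_other = -F(4, 3) * (F(16, 3) + nf)

# Asymptotic momentum fractions: 1/(16/3 + nf) per flavor, rest to gluons.
quark_fraction = 1 / (F(16, 3) + nf)
gluon_fraction = 1 - nf * quark_fraction

print(m2, quark_fraction, gluon_fraction)
```

Working with `Fraction` rather than floats keeps the eigenvector check exact, so a mistyped coefficient shows up as a failed equality rather than a small numerical discrepancy.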

Relation to the Altarelli-Parisi Equations 

The operator mixing analysis just described gives predictions for the moments
of parton distributions which imply that these integrals are $Q^2$ dependent. Of
the various moment integrals that do not involve operator mixing, only the
n = 1 integrals which give the flavor quantum numbers of the proton are
constant as a function of $Q^2$. The rest decrease as powers of $\log Q^2$. Similarly,
one linear combination of the matrix elements of n = 2 twist-2 operators
remains constant with $Q^2$. This relation is expressed by the sum rule (18.192).
To write this relation more clearly, let us introduce the parton distribution of
gluons as a smooth function satisfying the relations

$$\int_0^1 dx\,x^{n-1} f_g(x, Q^2) = A_g^n(Q^2). \tag{18.194}$$

Then (18.192) becomes just the total momentum sum rule for parton
distributions (17.39):

$$\int_0^1 dx\,x\,\Bigl[\sum_f \bigl(f_f(x) + f_{\bar f}(x)\bigr) + f_g(x)\Bigr] = 1. \tag{18.195}$$

It is not difficult to verify that, for n > 2, all of the eigenvalues of the
matrix $a^n_{ij}$ of anomalous dimension coefficients are negative. Thus, all of the
higher moment sum rules decrease, subject to the flavor charge and momen¬ 
tum conservation laws. In other words, the operator renormalization analysis 
predicts that parton distributions shift down to smaller values of x as log Q 2 
increases. It is pleasing that this is the same conclusion that we reached in 
Section 17.5, where we derived the Altarelli-Parisi equations to describe this 
evolution of the parton distributions. 
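
As a hedged spot check of the claim that all eigenvalues are negative for n > 2, the following sketch tests the 2 x 2 singlet matrix for a range of even n and a few values of $n_f$. It assumes the standard leading-order coefficients of (18.181), entered by hand; the function names are illustrative only. For a real 2 x 2 matrix with real eigenvalues, both eigenvalues are negative exactly when the trace is negative and the determinant is positive, so no eigenvalue solver is needed.

```python
from fractions import Fraction as F

def harmonic_tail(n):
    """Exact value of sum_{j=2}^{n} 1/j."""
    return sum((F(1, j) for j in range(2, n + 1)), F(0))

def mixing_matrix(n, nf):
    """Matrix acting on (sum_f O_f, O_g); coefficients assumed from (18.181)."""
    s = harmonic_tail(n)
    a_ff = F(-8, 3) * (1 + 4 * s - F(2, n * (n + 1)))
    a_fg = 4 * F(n * n + n + 2, n * (n + 1) * (n + 2))
    a_gf = F(16, 3) * F(n * n + n + 2, n * (n * n - 1))
    a_gg = -6 * (F(1, 3) + F(2, 9) * nf + 4 * s
                 - F(4, n * (n - 1)) - F(4, (n + 1) * (n + 2)))
    return [[a_ff, nf * a_fg], [a_gf, a_gg]]

def trace_det_disc(m):
    """Trace, determinant, and discriminant of the characteristic polynomial."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return tr, det, tr * tr - 4 * det

def both_eigenvalues_negative(n, nf):
    tr, det, disc = trace_det_disc(mixing_matrix(n, nf))
    # disc >= 0: eigenvalues are real; tr < 0 and det > 0: both are negative.
    return disc >= 0 and tr < 0 and det > 0

# Gluon operators exist only for even n, so scan even n > 2.
checked = {(n, nf): both_eigenvalues_negative(n, nf)
           for n in range(4, 41, 2) for nf in range(1, 7)}
print(all(checked.values()))
```

At n = 2 the determinant vanishes instead, which is the zero eigenvalue associated with the energy-momentum tensor.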

Given that the operator analysis and the Altarelli-Parisi equations
imply the same qualitative behavior for the parton distributions, how do these
analyses compare quantitatively? To compare them directly, we should work
out what predictions the Altarelli-Parisi equations make for the moments
of the parton distribution functions. Let us begin with the simpler case of
$f_f^-(x) = f_f(x) - f_{\bar f}(x)$.

To find the Altarelli-Parisi equation for this quantity, subtract the last
two equations of (17.128). The term involving the gluon distribution cancels,
and we find

$$\frac{d}{d\log Q^2}\,f_f^-(x,Q) = \frac{\alpha_s(Q^2)}{2\pi}\int_x^1\frac{dz}{z}\,P_{q\leftarrow q}(z)\,f_f^-\Bigl(\frac{x}{z},Q\Bigr). \tag{18.196}$$

Now define

$$M_{fn}^- = \int_0^1 dx\,x^{n-1} f_f^-(x). \tag{18.197}$$

This quantity obeys the differential equation

$$\frac{d}{d\log Q^2}\,M_{fn}^- = \frac{\alpha_s(Q^2)}{2\pi}\int_0^1 dx\,x^{n-1}\int_x^1\frac{dz}{z}\,P_{q\leftarrow q}(z)\,f_f^-\Bigl(\frac{x}{z}\Bigr). \tag{18.198}$$

Interchange the order of integration on the right-hand side, and change
variables to y = x/z:

$$\begin{aligned}
\int_0^1 dx\,x^{n-1}\int_x^1\frac{dz}{z}\,f\Bigl(\frac{x}{z}\Bigr) &= \int_0^1\frac{dz}{z}\int_0^z dx\,x^{n-1} f\Bigl(\frac{x}{z}\Bigr) \\
&= \int_0^1\frac{dz}{z}\int_0^1 z\,dy\,(zy)^{n-1} f(y) \\
&= \int_0^1 dz\,z^{n-1}\int_0^1 dy\,y^{n-1} f(y).
\end{aligned} \tag{18.199}$$


Then the right-hand side of the differential equation neatly factorizes:

$$\frac{d}{d\log Q^2}\,M_{fn}^- = \frac{\alpha_s(Q^2)}{2\pi}\int_0^1 dz\,z^{n-1} P_{q\leftarrow q}(z)\cdot\int_0^1 dy\,y^{n-1} f_f^-(y); \tag{18.200}$$

the last factor is again $M_{fn}^-$. The coefficient in this relation is the nth moment
of the splitting function $P_{q\leftarrow q}(z)$. We can compute this from the explicit form
of this function given in (17.129):

$$\int_0^1 dz\,z^{n-1} P_{q\leftarrow q}(z) = \int_0^1 dz\,z^{n-1}\,\frac{4}{3}\Bigl[\frac{1+z^2}{(1-z)_+} + \frac{3}{2}\,\delta(1-z)\Bigr]. \tag{18.201}$$

The integral over the distribution is done by using the definition (17.105):

$$\int_0^1 dz\,z^{n-1}\frac{1}{(1-z)_+} = \int_0^1 dz\,\frac{z^{n-1}-1}{1-z} = \int_0^1 dz\,\Bigl(-\sum_{j=0}^{n-2} z^j\Bigr) = -\sum_{j=1}^{n-1}\frac{1}{j}. \tag{18.202}$$

Then

$$\begin{aligned}
\int_0^1 dz\,z^{n-1} P_{q\leftarrow q}(z) &= -\frac{4}{3}\Bigl[\sum_{j=1}^{n-1}\frac{1}{j} + \sum_{j=1}^{n+1}\frac{1}{j} - \frac{3}{2}\Bigr] \\
&= -\frac{2}{3}\Bigl[1 + 4\sum_{j=2}^{n}\frac{1}{j} - \frac{2}{n(n+1)}\Bigr].
\end{aligned} \tag{18.203}$$

Remarkably, this is just $a_f^n/4$, as the anomalous dimension coefficient is given
in (18.172) or (18.181). Thus, according to the Altarelli-Parisi equations, the
nth moment of $f_f^-(x)$ obeys

$$\frac{d}{d\log Q^2}\,M_{fn}^- = \frac{\alpha_s(Q^2)}{8\pi}\,a_f^n\,M_{fn}^-. \tag{18.204}$$

To integrate this equation, we need the explicit form of $\alpha_s(Q^2)$. Inserting
expression (17.17), we find

$$\frac{d}{d\log Q^2}\,M_{fn}^- = \frac{a_f^n}{2b_0}\,\frac{1}{\log(Q^2/\Lambda^2)}\,M_{fn}^-. \tag{18.205}$$

The solution of this equation, derived from the Altarelli-Parisi equations, is
precisely the function (18.160) that we derived from the operator analysis of
the moment sum rules for $f_f^-$.
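
The agreement between the moment of the splitting function and $a_f^n/4$ can be verified exactly for many n at once. The sketch below is an illustrative check, not part of the text's derivation. It assumes the result of (18.202) for the plus-distribution integral and the form $a_f^n = -\frac{8}{3}[1 + 4\sum_{j=2}^n 1/j - 2/(n(n+1))]$ from (18.172); the function names are hypothetical.

```python
from fractions import Fraction as F

def harmonic(m):
    """Exact harmonic number H_m = sum_{j=1}^{m} 1/j (H_0 = 0)."""
    return sum((F(1, j) for j in range(1, m + 1)), F(0))

def moment_Pqq(n):
    """n-th moment of P_{q<-q}(z) = (4/3)[(1+z^2)/(1-z)_+ + (3/2) delta(1-z)].

    Uses int_0^1 dz z^{k-1}/(1-z)_+ = -H_{k-1}, applied with k = n and k = n+2.
    """
    return F(4, 3) * (-harmonic(n - 1) - harmonic(n + 1) + F(3, 2))

def a_f(n):
    """Anomalous dimension coefficient, form assumed from (18.172)."""
    tail = harmonic(n) - 1          # sum_{j=2}^{n} 1/j
    return F(-8, 3) * (1 + 4 * tail - F(2, n * (n + 1)))

# The two expressions should agree for every n >= 1.
agreement = all(moment_Pqq(n) == a_f(n) / 4 for n in range(1, 60))
print(agreement, moment_Pqq(1), moment_Pqq(2))
```

The n = 1 moment vanishes, which is the statement that the net quark number of each flavor does not evolve with $Q^2$.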

It is not difficult to check that this conclusion is more general. By taking
the nth moment of the full Altarelli-Parisi equations (17.128), we convert
these equations to a set of ordinary differential equations for the moments.
The linear combination of quark distribution functions

$$\sum_f \bigl(f_f(x) + f_{\bar f}(x)\bigr) \tag{18.206}$$

couples to the gluon distribution and leads to a 2 x 2 set of equations. All
orthogonal linear combinations separate from the gluon distribution and thus
have moments that obey equations identical to (18.205). To analyze the
coupled equations, define

$$M_n^+ = \int_0^1 dx\,x^{n-1}\sum_f \bigl(f_f(x) + f_{\bar f}(x)\bigr), \qquad M_{gn} = \int_0^1 dx\,x^{n-1} f_g(x). \tag{18.207}$$

Then one can show, by the manipulations that led to (18.205), that the
Altarelli-Parisi equations predict for these moments the set of coupled
equations

$$\begin{aligned}
\frac{d}{d\log Q^2}\,M_n^+ &= \frac{1}{2b_0}\,\frac{1}{\log(Q^2/\Lambda^2)}\bigl[a_{ff}^n M_n^+ + n_f\,a_{fg}^n M_{gn}\bigr], \\
\frac{d}{d\log Q^2}\,M_{gn} &= \frac{1}{2b_0}\,\frac{1}{\log(Q^2/\Lambda^2)}\bigl[a_{gf}^n M_n^+ + a_{gg}^n M_{gn}\bigr],
\end{aligned} \tag{18.208}$$

where the coefficients $a_{ij}^n$ are proportional to the nth moments of the splitting
functions given in (17.129) and (17.130). In all cases, one can see that these
coefficients agree precisely with the corresponding coefficients in (18.181). Thus,
the solution of these equations gives the same $Q^2$ dependence for the moments
of parton distribution functions that we found from the operator analysis.

Remarkably, the analysis of parton splitting functions given in Chapter 17 
and the analysis of operator renormalization factors given above have turned 
out to be two views of the same basic phenomenon. Both sets of equations
express the manner in which the constituents of hadrons in QCD are resolved,
layer by layer, by hard-scattering processes at successively higher values of the 
momentum transfer. Our understanding that a quark, when studied on a fine 
scale, is resolved into a set of quarks, antiquarks, and gluons indicates that we 
have gone far beyond the simple notions of one-particle relativistic mechanics. 
Our two complementary derivations of this idea reinforce its fundamental 
character as a prediction of quantum field theory. It is especially pleasing 
that, as we saw at the end of Section 17.5, Nature apparently accepts this 
prediction and makes this consequence of quantum field theory an essential 
part of the structure of hadrons. 

Problems 

18.1 Matrix element for proton decay. Some advanced theories of particle
interactions include heavy particles X whose couplings violate the conservation of baryon
number. Integrating out these particles produces an effective interaction that allows
the proton to decay to a positron and a photon or a pion. This effective interaction is
most easily written using the definite-helicity components of the quark and electron
fields: If $u_L$, $d_L$, $u_R$, $e_R$ are two-component spinors, then this effective interaction is

$$\Delta\mathcal{L} = \frac{2}{m_X^2}\,\epsilon_{abc}\,\epsilon^{\alpha\beta}\epsilon^{\gamma\delta}\,e_{R\alpha}\,u_{Ra\beta}\,u_{Lb\gamma}\,d_{Lc\delta}.$$

A typical value for the mass of the X boson is $m_X = 10^{16}$ GeV.

(a) Estimate, in order of magnitude, the value of the proton lifetime if the proton
is allowed to decay through this interaction.

(b) Show that the three-quark operator in $\Delta\mathcal{L}$ has an anomalous dimension

$$\gamma = -4\,\frac{g^2}{(4\pi)^2}.$$

Estimate the enhancement of the proton decay rate due to the leading QCD
corrections.

18.2 Parity-violating deep inelastic form factor. In this problem, we first
motivate the presence of additional deep inelastic form factors that are proportional to
differences of quark and antiquark distribution functions. Then we define these
functions formally and work out their properties.

(a) Analyze neutrino-proton scattering following the method used at the beginning
of Section 18.5. Define

$$J_+^\mu = \bar u\,\gamma^\mu\Bigl(\frac{1-\gamma^5}{2}\Bigr)d, \qquad J_-^\mu = \bar d\,\gamma^\mu\Bigl(\frac{1-\gamma^5}{2}\Bigr)u.$$

Let

$$W^{\mu\nu(\nu)} = 2i\int d^4x\,e^{iq\cdot x}\,\langle P|\,T\{J_-^\mu(x)J_+^\nu(0)\}\,|P\rangle,$$

averaged over the proton spin. Show that the cross section for deep inelastic
neutrino scattering can be computed from W according to

$$\frac{d^2\sigma}{dx\,dy}(\nu p \to \mu^- X) = \frac{G_F^2\,y}{2\pi^2}\cdot\mathrm{Im}\Bigl[\bigl(k_\mu k'_\nu + k_\nu k'_\mu - g_{\mu\nu}\,k\cdot k' - i\epsilon_{\mu\nu}{}^{\alpha\beta}k'_\alpha k_\beta\bigr)W^{\mu\nu(\nu)}(P,q)\Bigr].$$

(b) Show that any term in $W^{\mu\nu(\nu)}$ proportional to $q^\mu$ or $q^\nu$ gives zero when
contracted with the lepton momentum tensor in the formula above. Thus we can
expand $W^{\mu\nu(\nu)}$ with three scalar form factors,

$$W^{\mu\nu(\nu)} = -g^{\mu\nu}W_1^{(\nu)} + P^\mu P^\nu W_2^{(\nu)} + i\epsilon^{\mu\nu\lambda\sigma}P_\lambda q_\sigma W_3^{(\nu)} + \cdots,$$

where the additional terms do not contribute to the deep inelastic cross section.
Find the formula for the deep inelastic cross section in terms of the imaginary
parts of $W_1^{(\nu)}$, $W_2^{(\nu)}$, and $W_3^{(\nu)}$.

(c) Evaluate the form factors $W_i^{(\nu)}$ in the parton model, and show that

$$\mathrm{Im}\,W_1^{(\nu)} = \pi\bigl(f_d(x) + f_{\bar u}(x)\bigr),$$
$$\mathrm{Im}\,W_2^{(\nu)} = \frac{4\pi}{ys}\,x\bigl(f_d(x) + f_{\bar u}(x)\bigr),$$
$$\mathrm{Im}\,W_3^{(\nu)} = \frac{2\pi}{ys}\bigl(f_d(x) - f_{\bar u}(x)\bigr).$$

Insert these expressions into the formula derived in part (b) and show that the
result reproduces the first line of Eq. (17.35).

(d) This analysis motivates the following definition: For a single quark flavor f, let

$$J_{fL}^\mu = \bar f\,\gamma^\mu\Bigl(\frac{1-\gamma^5}{2}\Bigr)f.$$

Define

$$W_{fL}^{\mu\nu} = 2i\int d^4x\,e^{iq\cdot x}\,\langle P|\,T\{J_{fL}^\mu(x)J_{fL}^\nu(0)\}\,|P\rangle.$$

Decompose this tensor according to

$$W_{fL}^{\mu\nu} = -g^{\mu\nu}W_{1fL} + P^\mu P^\nu W_{2fL} + i\epsilon^{\mu\nu\lambda\sigma}P_\lambda q_\sigma W_{3fL} + \cdots,$$

where the remaining terms are proportional to $q^\mu$ or $q^\nu$. Evaluate the $W_{ifL}$ in
the parton model. Show that the quantities $W_{1fL}$ and $W_{2fL}$ reproduce the
expressions for $W_{1f}$ and $W_{2f}$ given by Eqs. (18.120) and (18.144), and that
$W_{3fL}$ is given by

$$\mathrm{Im}\,W_{3fL} = \frac{2\pi}{ys}\bigl(f_f(x) - f_{\bar f}(x)\bigr).$$

(e) Compute the operator product of the currents in the expression for $W_{fL}^{\mu\nu}$, and
write the terms in this product that involve twist-2 operators. Show that the
expressions for $W_{1fL}$ and $W_{2fL}$ that follow from this analysis reproduce the
expressions for $W_{1f}$ and $W_{2f}$ given by Eqs. (18.144) and (18.145). Find the
corresponding expression for $W_{3fL}$.

(f) Define the parton distribution $f_f^-$ by the relation

$$f_f^-(x,Q^2) = \frac{ys}{2\pi}\,\mathrm{Im}\,W_{3fL}(x,Q^2).$$

Show that, by virtue of this definition, the distribution function $f_f^-$ satisfies the
sum rule (18.155) for odd n.

18.3 Anomalous dimensions of gluon twist-2 operators. 

(a) Compute the divergent parts of the diagrams in Fig. 18.14, and use these to 
derive the second line of Eq. (18.181). Notice that this result holds only for n 
even. Show that the two diagrams cancel for n odd. 

(b) Compute the divergent parts of the diagrams in Fig. 18.15, and use these to derive
the third and fourth lines of Eq. (18.181).

18.4 Deep inelastic scattering from a photon. Consider the problem of deep-
inelastic scattering of an electron from a photon. This process can actually be measured
by analyzing the reaction $e^+e^- \to e^+e^- + X$ in the regime where the positron goes
forward, with emission of a collinear photon, which then has a hard reaction with the
electron. Let us analyze this process to leading order in QED and to leading-log order
in QCD. To predict the photon structure functions, it is reasonable to integrate the
renormalization group equations with the initial condition that the parton distribution
for photons in the photon is $\delta(x - 1)$ at $Q^2 = (\tfrac{1}{2}\ \text{GeV})^2$. Take $\Lambda$ = 150 MeV. Assume
for simplicity that there are four flavors of quarks, u, d, c, and s, with charges 2/3,
-1/3, 2/3, -1/3, respectively, and that it is always possible to ignore the masses of
these quarks.

(a) Use the Altarelli-Parisi equations to compute the parton distributions for quarks
and antiquarks in the photon, to leading order in QED and to zeroth order in
QCD. Compute also the probability that the photon remains a photon as a
function of $Q^2$.

(b) Formulate the problem of computing the moments of $W_2$ for the photon as a
problem in operator mixing. Compute the relevant anomalous dimension matrix
$\gamma$. You should be able to assemble this matrix from familiar ingredients without
doing further Feynman diagram computations.

(c) Compute the n = 2 moments of the photon structure functions as a function of
$Q^2$.

(d) Describe qualitatively the evolution of the photon structure function as a
function of x and $Q^2$.




Chapter 19 


Perturbation Theory Anomalies 


In many examples, we have seen that loop corrections can have an important 
effect on the predictions of quantum field theory. We have studied examples in 
which the relative importance of operators is shifted by radiative corrections, 
and in which the form of the interactions they mediate is altered. However, in 
specific circumstances, radiative corrections can have an even more significant 
effect: They can destroy symmetries of the classical equations of motion. 

The most important effect of this type involves the chiral symmetries of
theories with massless fermions. In Section 3.4, we saw that the massless Dirac
Lagrangian has an enhanced symmetry associated with the separate number
conservation of left- and right-handed fermions. This symmetry is generated
by the axial vector current $j^{\mu5} = \bar\psi\gamma^\mu\gamma^5\psi$. Classically,

$$\partial_\mu j^{\mu5} = 0 \tag{19.1}$$

for zero-mass fermions. This equation of motion is true not only in free fermion
theory but also, as a classical field equation, in massless QED and QCD.

However, in this chapter, we will see that the true picture is not so simple. 
We will show that, in gauge theories, the conservation of the axial vector 
current is actually incompatible with gauge invariance, and that radiative 
corrections in gauge theories supply a nonzero operator that appears on the 
right-hand side of Eq. (19.1). This new conservation equation for the axial 
current has a number of remarkable consequences, which we will discuss in 
Sections 19.3 and 19.4. 

19.1 The Axial Current in Two Dimensions 

Eventually, we will want to analyze the current conservation equation for the 
axial current in massless QCD. However, this discussion will involve some 
technical complication, so we will first study the physics that violates axial 
current conservation in a context in which the calculations are relatively sim¬ 
ple. A particularly simple model problem is that of two-dimensional massless 
QED. 

The Lagrangian of two-dimensional QED is

$$\mathcal{L} = \bar\psi(i\slashed{D})\psi - \frac{1}{4}(F_{\mu\nu})^2, \tag{19.2}$$

with $\mu, \nu = 0, 1$ and $D_\mu = \partial_\mu + ieA_\mu$. The Dirac matrices must be chosen to
satisfy the Dirac algebra

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}. \tag{19.3}$$

In two dimensions, this set of relations can be represented by 2 x 2 matrices;
we choose

$$\gamma^0 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \gamma^1 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}. \tag{19.4}$$

The Dirac spinors will be two-component fields.

The product of the Dirac matrices, which anticommutes with each of the
$\gamma^\mu$, is

$$\gamma^5 = \gamma^0\gamma^1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{19.5}$$

Then, just as in four dimensions, there are two possible currents,

$$j^\mu = \bar\psi\gamma^\mu\psi, \qquad j^{\mu5} = \bar\psi\gamma^\mu\gamma^5\psi, \tag{19.6}$$

and both are conserved if there is no mass term in the Lagrangian.

To make the conservation laws quite explicit, we label the components of
the fermion field $\psi$ in this spinor basis as

$$\psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix}. \tag{19.7}$$

The subscript indicates the $\gamma^5$ eigenvalue. Then, using the explicit
representations (19.4) and (19.7), we can rewrite the fermionic part of (19.2) as

$$\mathcal{L} = \psi_+^\dagger\,i(D_0 + D_1)\,\psi_+ + \psi_-^\dagger\,i(D_0 - D_1)\,\psi_-. \tag{19.8}$$

In the free theory, the field equation of $\psi_+$ would be

$$i(\partial_0 + \partial_1)\psi_+ = 0; \tag{19.9}$$

the solutions to this equation are waves that move to the right in the one-
dimensional space at the speed of light. We will thus refer to the particles
associated with $\psi_+$ as right-moving fermions. The quanta associated with $\psi_-$
are, similarly, left-moving. This distinction is analogous to the distinction
between left- and right-handed particles which gives the physical interpretation
of $\gamma^5$ in four dimensions. Since the Lagrangian (19.8) contains no terms that
mix left- and right-moving fields, it seems obvious that the number currents
for these fields are separately conserved. Thus,

$$\partial_\mu\Bigl(\bar\psi\gamma^\mu\Bigl(\frac{1-\gamma^5}{2}\Bigr)\psi\Bigr) = 0, \qquad \partial_\mu\Bigl(\bar\psi\gamma^\mu\Bigl(\frac{1+\gamma^5}{2}\Bigr)\psi\Bigr) = 0. \tag{19.10}$$

It is a curious property of two-dimensional spacetime that the vector and
axial vector fermionic currents are not independent of each other. Let $\epsilon^{\mu\nu}$ be
the totally antisymmetric symbol in two dimensions, with $\epsilon^{01} = +1$. Then the
two-dimensional Dirac matrices obey the identity

$$\gamma^\mu\gamma^5 = -\epsilon^{\mu\nu}\gamma_\nu. \tag{19.11}$$

The currents $j^{\mu5}$ and $j^\mu$ have the same relation. Thus we can study the
properties of the axial vector current by using results that we have already derived
for the vector current.
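
The two-dimensional Dirac algebra is small enough to verify by direct matrix multiplication. The sketch below is illustrative only: it assumes the representation $\gamma^0 = \sigma^2$, $\gamma^1 = i\sigma^1$ (one conventional choice for which $\gamma^5 = \gamma^0\gamma^1$ is diagonal), the metric $g = \mathrm{diag}(1, -1)$, and $\epsilon^{01} = +1$. It uses plain Python complex numbers, and the helper names are hypothetical.

```python
# Minimal 2x2 complex matrix helpers (pure Python, no external packages).
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

IDENTITY = [[1, 0], [0, 1]]
METRIC = [[1, 0], [0, -1]]          # g^{mu nu} = diag(+1, -1)
EPS = [[0, 1], [-1, 0]]             # antisymmetric symbol, eps^{01} = +1

# Assumed representation: gamma^0 = sigma^2, gamma^1 = i sigma^1.
GAMMA = [[[0, -1j], [1j, 0]],
         [[0, 1j], [1j, 0]]]
GAMMA5 = matmul(GAMMA[0], GAMMA[1])
# Lowered index: gamma_nu = g_{nu nu} gamma^nu for a diagonal metric.
GAMMA_LOWER = [scale(METRIC[nu][nu], GAMMA[nu]) for nu in range(2)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def identity_rhs(mu):
    """-eps^{mu nu} gamma_nu, summed over nu."""
    total = [[0, 0], [0, 0]]
    for nu in range(2):
        total = add(total, scale(-EPS[mu][nu], GAMMA_LOWER[nu]))
    return total

dirac_algebra_ok = all(
    anticommutator(GAMMA[m], GAMMA[n]) == scale(2 * METRIC[m][n], IDENTITY)
    for m in range(2) for n in range(2))
identity_ok = all(matmul(GAMMA[mu], GAMMA5) == identity_rhs(mu)
                  for mu in range(2))
# Trace identity used in the operator derivation: tr[g^a g^m g^5] = 2 eps^{am}.
trace_ok = all(
    trace(matmul(matmul(GAMMA[a], GAMMA[m]), GAMMA5)) == 2 * EPS[a][m]
    for a in range(2) for m in range(2))
print(dirac_algebra_ok, identity_ok, trace_ok)
```

Because every entry is 0, ±1, or ±i, the products are exact in floating point, so the equality comparisons are safe here.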


Vacuum Polarization Diagrams 

In Section 7.5, we computed the lowest-order vacuum polarization of QED in
dimensional regularization. In the limit of zero mass, we found, in Eq. (7.90),

$$i\Pi^{\mu\nu}(q) = -i(q^2g^{\mu\nu} - q^\mu q^\nu)\,\frac{2e^2}{(4\pi)^{d/2}}\,\mathrm{tr}[1]\int_0^1 dx\,x(1-x)\,\frac{\Gamma(2-\frac{d}{2})}{(-x(1-x)q^2)^{2-d/2}}, \tag{19.12}$$

where tr[1] = 4 gives the convention for tracing over Dirac matrices given in
Eq. (7.88). If we set tr[1] = 2 to be consistent with (19.4) and then set d = 2
in (19.12), we find the finite and well-defined result

$$i\Pi^{\mu\nu}(q) = -i(q^2g^{\mu\nu} - q^\mu q^\nu)\,\frac{2e^2}{4\pi}\cdot 2\cdot\frac{1}{-q^2} = i\Bigl(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2}\Bigr)\frac{e^2}{\pi}. \tag{19.13}$$

Notice that this expression has the structure of a photon mass term; the
photon receives the mass

$$m_\gamma^2 = \frac{e^2}{\pi}. \tag{19.14}$$

Schwinger showed that this result is exact, and that the photon of two-
dimensional QED is a free massive boson.* In the discussion below Eq. (7.72),
we pointed out that it is not possible for a vacuum polarization amplitude
consistent with the Ward identity to generate a mass for the photon unless it also
contains a pole at $q^2 = 0$. In two dimensions, such a pole can arise from the
infrared behavior of the fermion-antifermion intermediate state, and we see
this behavior explicitly in (19.13).
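
The arithmetic behind the photon mass can be spot-checked numerically. The sketch below is an illustration under stated assumptions: it takes the massless one-loop formula in the form $-i(q^2g^{\mu\nu} - q^\mu q^\nu)\,\frac{2e^2}{(4\pi)^{d/2}}\,\mathrm{tr}[1]\int_0^1 dx\,x(1-x)\,\Gamma(2-\frac{d}{2})/(-x(1-x)q^2)^{2-d/2}$, strips off the factors $e^2$ and $(-q^2)^{-(2-d/2)}$, and evaluates what remains at d = 2 with tr[1] = 2. The function name and the midpoint-rule integration are choices made for this example.

```python
import math

def vacuum_polarization_coefficient(d, trace_of_identity, steps=20000):
    """Coefficient C in  i Pi^{mu nu} = -i (q^2 g - q q) e^2 C (-q^2)^{-(2 - d/2)}.

    Midpoint rule over x in (0, 1); the endpoints are never sampled, so the
    integrand x(1-x) / (x(1-x))^(2 - d/2) is always well defined.
    """
    power = 2.0 - d / 2.0
    h = 1.0 / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        integral += x * (1.0 - x) / (x * (1.0 - x)) ** power * h
    prefactor = 2.0 / (4.0 * math.pi) ** (d / 2.0)
    return prefactor * trace_of_identity * math.gamma(power) * integral

# Two dimensions, two-component spinors.
coefficient = vacuum_polarization_coefficient(2.0, 2)
# With d = 2 the tensor becomes i (g - q q / q^2) e^2 C, so m_gamma^2 = e^2 C.
print(coefficient, 1.0 / math.pi)
```

At d = 2 the Feynman-parameter integrand is identically 1 and $\Gamma(1) = 1$, so the coefficient reduces to $(2/4\pi)\cdot 2 = 1/\pi$, in agreement with $m_\gamma^2 = e^2/\pi$.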

Once we have an explicit expression for the vacuum polarization, we can
find the expectation value of the current induced by a background
electromagnetic field. This quantity is generated by the diagram of Fig. 19.1, which
gives

$$\int d^2x\,e^{iq\cdot x}\,\langle j^\mu(x)\rangle = \frac{i}{e}\bigl(i\Pi^{\mu\nu}(q)\bigr)A_\nu(q) = -\Bigl(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2}\Bigr)\frac{e}{\pi}\,A_\nu(q), \tag{19.15}$$

*J. Schwinger, Phys. Rev. 128, 2425 (1962).

Figure 19.1. Computation of $\langle j^\mu\rangle$ in a background electromagnetic field.

where $A_\nu(q)$ is the Fourier transform of the background field. This quantity
manifestly satisfies the current conservation relation $q_\mu\langle j^\mu(q)\rangle = 0$.

The identity (19.11) between the vector and axial vector currents allows
us to derive from (19.15) the corresponding expectation value of $j^{\mu5}$. We find

$$\langle j^{\mu5}(q)\rangle = -\epsilon^{\mu\nu}\langle j_\nu(q)\rangle = \epsilon^{\mu\nu}\,\frac{e}{\pi}\Bigl(A_\nu(q) - \frac{q_\nu q^\lambda}{q^2}A_\lambda(q)\Bigr). \tag{19.16}$$

If the axial vector current were conserved, this object would satisfy the Ward
identity. Instead, we find

$$q_\mu\langle j^{\mu5}(q)\rangle = \frac{e}{\pi}\,\epsilon^{\mu\nu}q_\mu A_\nu(q). \tag{19.17}$$

This is the Fourier transform of the field equation

$$\partial_\mu j^{\mu5} = \frac{e}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}. \tag{19.18}$$

Apparently, the axial vector current is not conserved in the presence of
electromagnetic fields, as the result of an anomalous behavior of its vacuum
polarization diagram.

How could this happen? The Feynman diagrams formally satisfy the Ward
identity both for the vector and for the axial vector current. The problem
must come in the regularization of the vacuum polarization diagram. By
dimensional analysis, we know that this diagram has the form

$$i\Pi^{\mu\nu}(q) = ie^2\Bigl(A\,g^{\mu\nu} - B\,\frac{q^\mu q^\nu}{q^2}\Bigr). \tag{19.19}$$

The coefficient B is a finite integral, and is, in any event, unambiguously
determined by the low-energy structure of the theory since it is the residue
of the pole in $q^2$. However, the integral A is logarithmically divergent, so its
value depends on the regularization. Dimensional regularization automatically
subtracts this integral to set A = B; then the vector current Ward identity
is satisfied. But then we are led directly to (19.17). We could, alternatively,
regularize the integral A so that A = 0. Working through the steps of the
previous paragraph with this modification, we now find $q_\mu\langle j^{\mu5}(q)\rangle = 0$, but

$$q_\mu\langle j^\mu(q)\rangle = \frac{e}{\pi}\,q^\nu A_\nu(q). \tag{19.20}$$

Though the result (19.17) is unpleasant, the result (19.20) would be a
complete disaster, since it depends on the unphysical gauge degrees of freedom
of the vector potential. We conclude that it is not possible to regularize two-
dimensional QED so that, simultaneously, the theory is gauge invariant and
the axial vector current is conserved. The price of requiring gauge invariance
is the anomalous nonconservation of the axial current shown in (19.18).


The Axial Vector Current Operator Equation 

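To understand what happened to the axial current from another viewpoint,
we now study the operator equation for the divergence of $j^{\mu5}$. Varying the
Lagrangian (19.2), we find the following equations of motion for the fermion
fields:

$$\slashed{\partial}\psi = -ie\slashed{A}\psi, \qquad \partial_\mu\bar\psi\gamma^\mu = ie\bar\psi\slashed{A}. \tag{19.21}$$

By using these equations of motion in the most straightforward way, it is easy
to conclude that $\partial_\mu j^{\mu5} = 0$. However, a closer look at these manipulations
reveals some subtleties, which alter the final conclusion.

The axial vector current is a composite operator built out of fermion
fields. In the previous chapter we saw that products of local operators are
often singular, so we will define the current by placing the two fermion fields
at distinct points separated by a distance $\epsilon$ and then carefully taking the limit
as the two fields approach each other. Explicitly, we define

$$j^{\mu5} = \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\bar\psi(x+\tfrac{\epsilon}{2})\,\gamma^\mu\gamma^5\exp\Bigl[-ie\int_{x-\epsilon/2}^{x+\epsilon/2}dz\cdot A(z)\Bigr]\psi(x-\tfrac{\epsilon}{2})\Bigr\}. \tag{19.22}$$

Notice that, because we have placed $\bar\psi$ and $\psi$ at different points, we must
introduce a Wilson line (15.53) in order that the operator be locally gauge
invariant. To give $j^{\mu5}$ the correct transformation properties under Lorentz
transformations, the limit $\epsilon \to 0$ should be taken symmetrically,

$$\underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\frac{\epsilon^\mu}{\epsilon^2}\Bigr\} = 0, \qquad \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\frac{\epsilon^\mu\epsilon^\nu}{\epsilon^2}\Bigr\} = \frac{1}{d}\,g^{\mu\nu}, \tag{19.23}$$

with d = 2 in this case.

We now compute the divergence of the axial current defined as in (19.22):

$$\begin{aligned}
\partial_\mu j^{\mu5} = \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{ &\bigl(\partial_\mu\bar\psi(x+\tfrac{\epsilon}{2})\bigr)\gamma^\mu\gamma^5\exp\Bigl[-ie\int_{x-\epsilon/2}^{x+\epsilon/2}dz\cdot A(z)\Bigr]\psi(x-\tfrac{\epsilon}{2}) \\
&+ \bar\psi(x+\tfrac{\epsilon}{2})\,\gamma^\mu\gamma^5\exp\Bigl[-ie\int_{x-\epsilon/2}^{x+\epsilon/2}dz\cdot A(z)\Bigr]\bigl(\partial_\mu\psi(x-\tfrac{\epsilon}{2})\bigr) \\
&+ \bar\psi(x+\tfrac{\epsilon}{2})\,\gamma^\mu\gamma^5\bigl[-ie\,\epsilon^\nu\partial_\mu A_\nu(x)\bigr]\psi(x-\tfrac{\epsilon}{2})\Bigr\}.
\end{aligned} \tag{19.24}$$

Using the equations of motion (19.21), and keeping terms up to order $\epsilon$, we
can reduce this to

$$\begin{aligned}
\partial_\mu j^{\mu5} &= \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\bar\psi(x+\tfrac{\epsilon}{2})\bigl[ie\slashed{A}(x+\tfrac{\epsilon}{2}) - ie\slashed{A}(x-\tfrac{\epsilon}{2}) - ie\,\epsilon^\nu\gamma^\mu\partial_\mu A_\nu(x)\bigr]\gamma^5\psi(x-\tfrac{\epsilon}{2})\Bigr\} \\
&= \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\bar\psi(x+\tfrac{\epsilon}{2})\bigl[-ie\,\gamma^\mu\epsilon^\nu(\partial_\mu A_\nu - \partial_\nu A_\mu)\bigr]\gamma^5\psi(x-\tfrac{\epsilon}{2})\Bigr\}.
\end{aligned} \tag{19.25}$$

Expression (19.25) seems to vanish in the limit $\epsilon \to 0$. However, we must
take account of the fact that the product of the fermion operators is singular.
In two dimensions, the contraction of fermion fields is

$$\overbracket{\psi(y)\bar\psi(z)} = \int\frac{d^2k}{(2\pi)^2}\,e^{-ik\cdot(y-z)}\,\frac{i\slashed{k}}{k^2} = -\slashed{\partial}\Bigl(\frac{i}{4\pi}\log(y-z)^2\Bigr) = \frac{-i}{2\pi}\,\frac{\gamma^\alpha(y-z)_\alpha}{(y-z)^2}. \tag{19.26}$$

Thus,

$$\overbracket{\bar\psi(x+\tfrac{\epsilon}{2})\,\Gamma\,\psi(x-\tfrac{\epsilon}{2})} = \frac{-i}{2\pi}\,\mathrm{tr}\Bigl[\frac{\gamma^\alpha\epsilon_\alpha}{\epsilon^2}\,\Gamma\Bigr]. \tag{19.27}$$

Notice that the result (19.27) contains an extra minus sign from the
interchange of fermion operators.

Because the contraction of fermion fields is singular as $\epsilon \to 0$, the terms
of order $\epsilon$ in the last line of (19.25) can give a finite contribution. Taking the
contraction according to (19.27), we find

$$\partial_\mu j^{\mu5} = \underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\frac{-i}{2\pi}\,\mathrm{tr}\Bigl[\frac{\gamma^\alpha\epsilon_\alpha}{\epsilon^2}\,\gamma^\mu\gamma^5\Bigr]\bigl(-ie\,\epsilon^\nu F_{\mu\nu}\bigr)\Bigr\}. \tag{19.28}$$

In two dimensions, $\mathrm{tr}[\gamma^\alpha\gamma^\mu\gamma^5] = 2\epsilon^{\alpha\mu}$. Thus,

$$\partial_\mu j^{\mu5} = \frac{e}{2\pi}\,\underset{\epsilon\to0}{\text{symm lim}}\Bigl\{\frac{2\epsilon_\mu\epsilon^\nu}{\epsilon^2}\Bigr\}\epsilon^{\mu\alpha}F_{\nu\alpha}. \tag{19.29}$$

Now take the symmetric limit according to the prescription (19.23). We find
precisely the anomalous nonconservation equation (19.18). In this derivation,
(19.18) appears as an operator relation, rather than in a simple matrix
element. Notice that, as in our first derivation of this equation, the assumption
of local gauge invariance played a crucial role. If we had defined the axial
vector current by reversing the sign of the Wilson line in (19.22), a prescription
that would have done violence to local gauge invariance, we would have found
the various contributions canceling on the right-hand side of (19.29).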





An Example with Fermion Number Nonconservation 

To complete our discussion of the two-dimensional axial vector current, we 
will show that the nonconservation equation (19.18) also has a global aspect. 
In free fermion theory, the integral of the axial current conservation law gives 

I d 2 x d lt f 5 =N r -N l = 0. (19.30) 

This relation implies that the difference in the number of right-moving and 
left-moving fermions cannot be changed in any possible process. Combining 
this with the conservation law for the vector current, we conclude that the 
number of each type of fermion is separately conserved. From (19.8), we might 
conclude that these separate conservation laws hold also in two-dimensional 
QED. However, we have already found that we must be careful in making 
statements about the axial current. 

In two-dimensional QED, the conservation equation for the axial current
is replaced by the anomalous nonconservation equation (19.18). If the
right-hand side of this equation were the total derivative of a quantity falling off
sufficiently rapidly at infinity, its integral would vanish and we would still
retain the global conservation law. In fact, $\epsilon^{\mu\nu}F_{\mu\nu}$ is a total derivative:

$$ \epsilon^{\mu\nu}F_{\mu\nu} = 2\,\partial_\mu(\epsilon^{\mu\nu}A_\nu). \tag{19.31} $$

However, it is easy to imagine examples where the integral of this quantity
does not vanish, for example, a world with a constant background electric
field. In such a world, the conservation law (19.30) must be violated. But how
can this happen?

Let us analyze this problem by thinking about fermions in one space
dimension in a background $A^1$ field that is constant in space and has a very
slow time dependence. We will assume that the system has a finite length $L$,
with periodic boundary conditions. Notice that the constant $A^1$ field cannot
be removed by a gauge transformation that satisfies the periodic boundary
conditions. One way to see this is to note that the system gives a nonzero
value to the Wilson line

$$ \exp\!\left[-ie\int_0^L dx\, A_1(x)\right], \tag{19.32} $$

which forms a gauge-invariant closed loop due to the periodic boundary
conditions.

Following the derivation of the three-dimensional Hamiltonian, Eq. (3.84),
we find that the Hamiltonian of this one-dimensional system is

$$ H = \int dx\; \psi^\dagger\bigl(-i\alpha^1 D_1\bigr)\psi, \tag{19.33} $$

where $\alpha^1 = \gamma^0\gamma^1 = \gamma^5$. In the components (19.7),

$$ H = \int dx\,\Bigl\{ -i\psi_+^\dagger(\partial_1 - ieA^1)\psi_+ + i\psi_-^\dagger(\partial_1 - ieA^1)\psi_- \Bigr\}. \tag{19.34} $$

For a constant $A^1$ field, it is easy to diagonalize this Hamiltonian. The
eigenstates of the covariant derivatives are wavefunctions

$$ e^{ik_n x}, \quad\text{with}\quad k_n = \frac{2\pi n}{L}, \quad n = -\infty, \ldots, \infty, \tag{19.35} $$

to satisfy the periodic boundary conditions. Then the single-particle
eigenstates of $H$ have energies

$$ \psi_+:\;\; E_n = +(k_n - eA^1), \qquad\quad \psi_-:\;\; E_n = -(k_n - eA^1). \tag{19.36} $$

Each type of fermion has an infinite tower of equally spaced levels. To find
the ground state of $H$, we fill the negative energy levels and interpret holes
created among these filled states as antiparticles.

Now adiabatically change the value of $A^1$. The fermion energy levels
slowly shift in accord with the relations (19.36). If $A^1$ changes by the finite
amount

$$ \Delta A^1 = \frac{2\pi}{eL}, \tag{19.37} $$

which brings the Wilson loop (19.32) back to its original value, the spectrum
of $H$ returns to its original form. In this process, each level of $\psi_+$ moves down
to the next position, and each level of $\psi_-$ moves up to the next position, as
shown in Fig. 19.2. The occupation numbers of levels should be maintained in
this adiabatic process. Thus, remarkably, one right-moving fermion disappears
from the vacuum and one extra left-moving fermion appears. At the same time,

$$ \int d^2x\left(\frac{e}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}\right) = \int dt\,dx\;\frac{e}{\pi}\,\partial_0 A_1 = \frac{e}{\pi}\,L\,(-\Delta A^1) = -2, \tag{19.38} $$

where we have inserted (19.37) in the last line. Thus the integrated form of
the anomalous nonconservation equation (19.18) is indeed satisfied:

$$ \Delta N_R - \Delta N_L = \int d^2x\left(\frac{e}{2\pi}\,\epsilon^{\mu\nu}F_{\mu\nu}\right). \tag{19.39} $$

Even in this simple example, we see that it is not possible to escape
the question of ultraviolet regularization in analyzing the chiral conservation
law. Right-moving fermions are lost and left-moving fermions appear from the
depths of the fermionic spectrum, $E \to -\infty$. In computing the changes in the
separate fermion numbers, we have assumed that the vacuum cannot change
the charge it contains at large negative energies. This prescription is gauge
invariant, but it leads to the nonconservation of the axial vector current.

Figure 19.2. Effect on the vacuum state of the Hamiltonian $H$ of
one-dimensional QED due to an adiabatic change in the background $A^1$ field.
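
The level-shifting argument can be checked with a small counting exercise. The sketch below is only an illustration under stated assumptions: energies are measured in units of $2\pi/L$, the background field enters through the dimensionless variable a = e A^1 L / (2 pi) (so the shift (19.37) is a change of a by 1), the infinite tower is truncated at |n| <= n_max, and the starting value a = 1/2 is chosen so that no level sits exactly at zero energy. The function name and the truncation are hypothetical conveniences, not part of the text.

```python
def fermion_number_changes(a_start, a_end, n_max=50):
    """Count particles minus holes created when a = e*A^1*L/(2*pi) moves
    adiabatically from a_start to a_end.

    Returns (delta_N_R, delta_N_L) for psi_+ (E = +(n - a)) and
    psi_- (E = -(n - a)), with energies in units of 2*pi/L as in (19.36).
    The cutoff n_max is an assumption made only to keep the sums finite.
    """
    levels = range(-n_max, n_max + 1)
    changes = []
    for sign in (+1, -1):
        def energy(n, a):
            return sign * (n - a)

        # Initial vacuum: every negative-energy level is filled.
        filled = {n for n in levels if energy(n, a_start) < 0}
        # Occupations stay frozen while the levels move.
        particles = sum(1 for n in filled if energy(n, a_end) > 0)
        holes = sum(1 for n in levels
                    if n not in filled and energy(n, a_end) < 0)
        changes.append(particles - holes)
    return tuple(changes)


delta_n_r, delta_n_l = fermion_number_changes(0.5, 1.5)
# Right-hand side of (19.38): (e/pi) * L * (-Delta A^1) = -2 * (change in a).
anomaly_integral = -2 * (1.5 - 0.5)
print(delta_n_r, delta_n_l, delta_n_r - delta_n_l, anomaly_integral)
```

The levels that change sign sit near zero energy, far from the cutoff, so the count does not depend on n_max; the text's point is that this is only true because the filling at large negative energy is held fixed.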


19.2 The Axial Current in Four Dimensions 

All of the derivations we have just given for the two-dimensional axial current
have analogues in four dimensions. In Eq. (3.40), we showed that, in the case
of massless fermions, the four-dimensional Dirac equation splits neatly into
separate equations for left- and right-handed fermions. If we couple the Dirac
equation to a gauge field, we replace derivatives by covariant derivatives. This
does not seem to affect the manifest separation between the two helicity
components. Thus it seems clear that both the vector and axial vector currents
should remain conserved. However, after the analysis we have just completed
for the two-dimensional case, we know that we should not take these
conservation laws for granted. We will now make a more careful analysis of the axial
vector conservation law in four dimensions.

The Axial Vector Current Operator Equation 

We begin with the case of massless four-dimensional QED. Of the three
arguments that we gave in the previous section for the two-dimensional axial
current conservation law, the operator derivation generalizes most easily. The
fermion field equations (19.21) are identical in the four-dimensional case. We
can again adopt the gauge-invariant definition of the axial vector current
(19.22). When we take the divergence of this current, all of the manipulations
leading to Eq. (19.25) are still correct.

From this point, we must compute the singular terms in the operator
product of the two fermion fields in the limit $\varepsilon \to 0$. As in two dimensions,
the leading term is given by contracting the two operators using a free-field
propagator. This contribution gives

$$ \psi(y)\bar\psi(z) = \int\frac{d^4k}{(2\pi)^4}\,e^{-ik\cdot(y-z)}\,\frac{i\slashed{k}}{k^2} = \frac{-i}{2\pi^2}\,\frac{\gamma^\alpha (y-z)_\alpha}{(y-z)^4}. \tag{19.40} $$

This is highly singular as $(y-z) \to 0$, but it gives zero when traced with
$\gamma^\mu\gamma^5$. To find a nonzero result, we must consider terms of higher order in the
expansion of the product of operators.

Figure 19.3. Expansion of $\psi(y)\bar\psi(z)$ in the presence of a background gauge
field.

In a nonzero background gauge field, the contraction of fermion fields
is given by the series of diagrams shown in Fig. 19.3. We have computed the
leading term in this series in (19.40). The higher terms give less singular terms
as $(y-z) \to 0$. The second term in the series is given by


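$$ \int\frac{d^4k}{(2\pi)^4}\frac{d^4p}{(2\pi)^4}\; e^{-i(k+p)\cdot y}\,e^{ik\cdot z}\;\frac{i(\slashed{k}+\slashed{p})}{(k+p)^2}\,\bigl(-ie\slashed{A}(p)\bigr)\,\frac{i\slashed{k}}{k^2}. \tag{19.41} $$

This contribution leads to

$$ \begin{aligned} \bigl\langle \bar\psi(x+\tfrac{\varepsilon}{2})\gamma^\mu\gamma^5\psi(x-\tfrac{\varepsilon}{2}) \bigr\rangle &= \int\frac{d^4k}{(2\pi)^4}\frac{d^4p}{(2\pi)^4}\; e^{ik\cdot\varepsilon}\,e^{-ip\cdot x}\;\mathrm{tr}\!\left[(-\gamma^\mu\gamma^5)\frac{i(\slashed{k}+\slashed{p})}{(k+p)^2}\bigl(-ie\slashed{A}(p)\bigr)\frac{i\slashed{k}}{k^2}\right] \\ &= \int\frac{d^4k}{(2\pi)^4}\frac{d^4p}{(2\pi)^4}\; e^{ik\cdot\varepsilon}\,e^{-ip\cdot x}\;\frac{4e\,\epsilon^{\mu\alpha\beta\gamma}(k+p)_\alpha A_\beta(p)\,k_\gamma}{k^2(k+p)^2}. \end{aligned} \tag{19.42} $$

To evaluate the limit $\varepsilon \to 0$, we can expand the integrand for large $k$. Then

$$ \begin{aligned} \bigl\langle \bar\psi(x+\tfrac{\varepsilon}{2})\gamma^\mu\gamma^5\psi(x-\tfrac{\varepsilon}{2}) \bigr\rangle &\sim 4e\,\epsilon^{\mu\alpha\beta\gamma}\int\frac{d^4p}{(2\pi)^4}\,e^{-ip\cdot x}\,p_\alpha A_\beta(p)\int\frac{d^4k}{(2\pi)^4}\,e^{ik\cdot\varepsilon}\,\frac{k_\gamma}{k^4} \\ &= 4e\,\epsilon^{\mu\alpha\beta\gamma}\bigl(\partial_\alpha A_\beta(x)\bigr)\frac{\partial}{\partial\varepsilon^\gamma}\left(\frac{i}{16\pi^2}\log\frac{1}{\varepsilon^2}\right) \\ &= 2e\,\epsilon^{\alpha\beta\mu\gamma}F_{\alpha\beta}(x)\left(\frac{-i}{8\pi^2}\,\frac{\varepsilon_\gamma}{\varepsilon^2}\right). \end{aligned} \tag{19.43} $$

Figure 19.4. Diagrams contributing to the two-photon matrix element of
the divergence of the axial vector current.

Substituting this expression into (19.25), we find

$$ \partial_\mu j^{\mu 5} = \mathop{\text{symm lim}}_{\varepsilon\to 0}\left\{ \frac{e}{4\pi^2}\,\epsilon^{\alpha\beta\mu\gamma}F_{\alpha\beta}\left(\frac{-i\varepsilon_\gamma}{\varepsilon^2}\right)(-ie\,\varepsilon^\nu F_{\mu\nu}) \right\}. \tag{19.44} $$

Now take the symmetric limit $\varepsilon \to 0$ in four dimensions. We find

$$ \partial_\mu j^{\mu 5} = -\frac{e^2}{16\pi^2}\,\epsilon^{\alpha\beta\mu\nu}F_{\alpha\beta}F_{\mu\nu}. \tag{19.45} $$

This equation, which expresses the anomalous nonconservation of the
four-dimensional axial current, is known as the Adler-Bell-Jackiw anomaly. Adler
and Bardeen proved that this operator relation is actually correct to all orders
in QED perturbation theory and receives no further radiative corrections.†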
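
The step from (19.44) to (19.45) can be written out as follows. This is a sketch that assumes the symmetric-limit prescription (19.23) generalizes to $d = 4$ as symm lim $\{\varepsilon^\mu\varepsilon^\nu/\varepsilon^2\} = g^{\mu\nu}/4$; that form is not quoted in this passage and is taken here as an assumption.

```latex
\mathop{\text{symm lim}}_{\varepsilon\to 0}
  \frac{\varepsilon_\gamma\varepsilon^\nu}{\varepsilon^2}
  = \tfrac{1}{4}\,\delta_\gamma{}^{\nu}
\quad\Longrightarrow\quad
\partial_\mu j^{\mu 5}
  = \frac{e}{4\pi^2}\,(-i)(-ie)\cdot\tfrac{1}{4}\;
    \epsilon^{\alpha\beta\mu\gamma}F_{\alpha\beta}F_{\mu\gamma}
  = -\frac{e^2}{16\pi^2}\,
    \epsilon^{\alpha\beta\mu\gamma}F_{\alpha\beta}F_{\mu\gamma}
```

The product $(-i)(-ie) = -e$ supplies the overall sign, and the factor $1/4$ from the angular average converts $e^2/4\pi^2$ into $e^2/16\pi^2$.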


Triangle Diagrams 


We can confirm the Adler-Bell-Jackiw relation by checking, in standard
perturbation theory, that the divergence of the axial vector current has a nonzero
matrix element to create two photons. To do this, we must analyze the matrix
element

$$ \int d^4x\; e^{-iq\cdot x}\,\langle p,k|\, j^{\mu 5}(x)\,|0\rangle = (2\pi)^4\delta^{(4)}(p+k-q)\;\epsilon^*_\nu(p)\,\epsilon^*_\lambda(k)\,\mathcal{M}^{\mu\nu\lambda}(p,k). \tag{19.46} $$

The leading-order diagrams contributing to $\mathcal{M}^{\mu\nu\lambda}$ are shown in Fig. 19.4. The
first diagram gives the contribution

$$ (-1)(-ie)^2\int\frac{d^4\ell}{(2\pi)^4}\;\mathrm{tr}\!\left[\gamma^\mu\gamma^5\,\frac{i(\slashed{\ell}-\slashed{k})}{(\ell-k)^2}\,\gamma^\lambda\,\frac{i\slashed{\ell}}{\ell^2}\,\gamma^\nu\,\frac{i(\slashed{\ell}+\slashed{p})}{(\ell+p)^2}\right], \tag{19.47} $$

and the second diagram gives an identical contribution with $(p, \nu)$ and $(k, \lambda)$
interchanged.

It is easy to give a formal argument that the matrix element of the
divergence of the axial current vanishes at this order. Taking the divergence of
the axial current in (19.46) is equivalent to dotting this quantity with $iq_\mu$.


†S. Adler and W. A. Bardeen, Phys. Rev. 182, 1517 (1969); S. Adler, in Deser,
et al. (1970).

Now we operate on the right-hand side of (19.47) as we do to prove a Ward
identity. Replace

$$ q_\mu\gamma^\mu\gamma^5 = (\slashed{\ell}+\slashed{p}-\slashed{\ell}+\slashed{k})\gamma^5 = (\slashed{\ell}+\slashed{p})\gamma^5 + \gamma^5(\slashed{\ell}-\slashed{k}). \tag{19.48} $$

Each momentum factor combines with the numerator adjacent to it to cancel
the corresponding denominator. This brings (19.47) into the form

$$ iq_\mu\cdot(19.47) = e^2\int\frac{d^4\ell}{(2\pi)^4}\;\mathrm{tr}\!\left[\gamma^5\,\frac{(\slashed{\ell}-\slashed{k})}{(\ell-k)^2}\,\gamma^\lambda\,\frac{\slashed{\ell}}{\ell^2}\,\gamma^\nu + \gamma^5\gamma^\lambda\,\frac{\slashed{\ell}}{\ell^2}\,\gamma^\nu\,\frac{(\slashed{\ell}+\slashed{p})}{(\ell+p)^2}\right]. \tag{19.49} $$

Now pass $\gamma^\lambda$ through $\gamma^5$ in the second term and shift the integral over the
first term according to $\ell \to (\ell + k)$:

$$ iq_\mu\cdot(19.47) = e^2\int\frac{d^4\ell}{(2\pi)^4}\;\mathrm{tr}\!\left[\gamma^5\,\frac{\slashed{\ell}}{\ell^2}\,\gamma^\lambda\,\frac{(\slashed{\ell}+\slashed{k})}{(\ell+k)^2}\,\gamma^\nu - \gamma^5\,\frac{\slashed{\ell}}{\ell^2}\,\gamma^\nu\,\frac{(\slashed{\ell}+\slashed{p})}{(\ell+p)^2}\,\gamma^\lambda\right]. \tag{19.50} $$

This expression is manifestly antisymmetric under the interchange of $(p, \nu)$
and $(k, \lambda)$, so the contribution of the second diagram in Fig. 19.4 precisely
cancels (19.47).

However, because this derivation involves a shift of the integration
variable, we should look closely at whether this shift is allowed by the
regularization. From (19.47), we see that the integral that must be shifted is divergent. If
the diagram is regulated with a simple momentum cutoff, or even with
Pauli-Villars regularization, it turns out that the shift leaves over a finite, nonzero
term. In Chapter 7, we encountered a similar problem in our discussion of the
QED vacuum polarization diagram. We evaded the problem there by using
dimensional regularization. Dimensional regularization of the diagrams of Fig.
19.4 will automatically insure the validity of the QED Ward identities for the
photon emission vertices,

$$ p_\nu\,\mathcal{M}^{\mu\nu\lambda} = k_\lambda\,\mathcal{M}^{\mu\nu\lambda} = 0. \tag{19.51} $$

But in the analysis of the axial vector current, even dimensional regularization
has an extra subtlety, because $\gamma^5$ is an intrinsically four-dimensional object.
In their original paper on dimensional regularization,‡ 't Hooft and Veltman
suggested using the definition

$$ \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 \tag{19.52} $$

in $d$ dimensions. This definition has the consequence that $\gamma^5$ anticommutes
with $\gamma^\mu$ for $\mu = 0, 1, 2, 3$ but commutes with $\gamma^\mu$ for other values of $\mu$.

In the evaluation of (19.47), the external indices and the momenta $p$,
$k$, $q$ all live in the physical four dimensions, but the loop momentum $\ell$ has
components in all dimensions. Write

$$ \ell = \ell_\parallel + \ell_\perp, \tag{19.53} $$

‡G. 't Hooft and M. J. G. Veltman, Nucl. Phys. B44, 189 (1972).





where the first term has nonzero components in dimensions 0, 1, 2, 3 and the
second term has nonzero components in the other $d-4$ dimensions. Because $\gamma^5$
commutes with the $\gamma^\mu$ in these extra dimensions, identity (19.48) is modified
to

$$ q_\mu\gamma^\mu\gamma^5 = (\slashed{\ell}+\slashed{p})\gamma^5 + \gamma^5(\slashed{\ell}-\slashed{k}) - 2\gamma^5\slashed{\ell}_\perp. \tag{19.54} $$
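
The extra term in (19.54) follows in two short steps. The sketch below uses only $q = p + k$, the decomposition (19.53), and the commutation properties of $\gamma^5$ stated after (19.52); the notation $\slashed{\ell}_\parallel$, $\slashed{\ell}_\perp$ for the two pieces of the slashed loop momentum is introduced here for illustration.

```latex
q_\mu\gamma^\mu\gamma^5
  = (\slashed{\ell}+\slashed{p})\gamma^5 - (\slashed{\ell}-\slashed{k})\gamma^5 ,
\qquad
(\slashed{\ell}-\slashed{k})\gamma^5
  = -\gamma^5(\slashed{\ell}_\parallel-\slashed{k}) + \gamma^5\slashed{\ell}_\perp
  = -\gamma^5(\slashed{\ell}-\slashed{k}) + 2\gamma^5\slashed{\ell}_\perp
```

Substituting the second relation into the first reproduces the three terms of (19.54); the last one vanishes when the loop momentum is purely four-dimensional.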


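The first two terms cancel according to the argument given above; the shift
in (19.50) is justified by the dimensional regularization. However, the third
term of (19.54) gives an additional contribution:

$$ iq_\mu\cdot(19.47) = e^2\int\frac{d^4\ell}{(2\pi)^4}\;\mathrm{tr}\!\left[-2\gamma^5\slashed{\ell}_\perp\,\frac{(\slashed{\ell}-\slashed{k})}{(\ell-k)^2}\,\gamma^\lambda\,\frac{\slashed{\ell}}{\ell^2}\,\gamma^\nu\,\frac{(\slashed{\ell}+\slashed{p})}{(\ell+p)^2}\right]. \tag{19.55} $$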


To evaluate this contribution, combine denominators in the standard way, and
shift the integration variable $\ell \to \ell + P$, where $P = xk - yp$. In expanding
the numerator, we must retain one factor each of $\gamma^\nu$, $\gamma^\lambda$, $\slashed{p}$, and $\slashed{k}$ to give a
nonzero trace with $\gamma^5$. This leaves over one factor of $\slashed{\ell}_\perp$ and one factor of $\slashed{\ell}$,
which must also be evaluated with components in extra dimensions in order
to give a nonzero integral. The factors $\slashed{\ell}_\perp$ anticommute with the other Dirac
matrices in the problem and thus can be moved to adjacent positions. Then
we must evaluate the integral

$$ \int\frac{d^4\ell}{(2\pi)^4}\;\frac{\slashed{\ell}_\perp\slashed{\ell}_\perp}{(\ell^2-\Delta)^3}, \tag{19.56} $$

where $\Delta$ is a function of $k$, $p$, and the Feynman parameters. Using

$$ (\slashed{\ell}_\perp)^2 = \ell_\perp^2 \;\to\; \frac{(d-4)}{d}\,\ell^2 \tag{19.57} $$

under the symmetrical integration, we can evaluate (19.56) as

$$ \frac{i}{(4\pi)^{d/2}}\,\frac{(d-4)}{2}\,\frac{\Gamma(2-\frac{d}{2})}{\Gamma(3)\,\Delta^{2-d/2}} \;\xrightarrow[\;d\to 4\;]{}\; \frac{-i}{2(4\pi)^2}. \tag{19.58} $$
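
The limit in (19.58) is easy to check numerically. The following is a minimal sketch, assuming the expression on the left of (19.58) with the overall factor of $i$ stripped off; the function name and the choice $\Delta = 1$ are illustrative only.

```python
import math


def anomalous_integral_coefficient(d, delta=1.0):
    """Real coefficient multiplying i on the left-hand side of (19.58),
    evaluated at (non-integer) spacetime dimension d < 4."""
    return ((d - 4) / 2) * math.gamma(2 - d / 2) / (
        (4 * math.pi) ** (d / 2) * math.gamma(3) * delta ** (2 - d / 2)
    )


# Claimed d -> 4 limit, again without the factor of i.
limit_value = -1 / (2 * (4 * math.pi) ** 2)

for d in (3.9, 3.99, 3.999, 3.999999):
    print(d, anomalous_integral_coefficient(d), limit_value)
```

The pole of the Gamma function at $d = 4$ is cancelled by the explicit factor $(d-4)$, which is why the printed values approach a finite number that no longer depends on `delta`.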


Notice the behavior in which a logarithmically divergent integral contributes
a factor $(d-4)$ in the denominator and allows an anomalous term, formally
proportional to $(d-4)$, to give a finite contribution. The remainder of the
algebra in the evaluation of (19.55) is straightforward. The terms involving
the momentum shift $P$ cancel, and we find

$$ iq_\mu\cdot(19.47) = \frac{e^2}{4\pi^2}\,\epsilon^{\alpha\lambda\beta\nu}k_\alpha p_\beta. \tag{19.59} $$

This term is symmetric under the interchange of $(p, \nu)$ with $(k, \lambda)$, so the
second diagram of Fig. 19.4 gives an equal contribution. Thus,

$$ \begin{aligned} \langle p,k|\,\partial_\mu j^{\mu 5}(0)\,|0\rangle &= -\frac{e^2}{2\pi^2}\,\epsilon^{\alpha\nu\beta\lambda}(-ip_\alpha)\,\epsilon^*_\nu(p)\,(-ik_\beta)\,\epsilon^*_\lambda(k) \\ &= -\frac{e^2}{16\pi^2}\,\langle p,k|\,\epsilon^{\alpha\nu\beta\lambda}F_{\alpha\nu}F_{\beta\lambda}(0)\,|0\rangle, \end{aligned} \tag{19.60} $$

as we would expect from the Adler-Bell-Jackiw anomaly equation.


Chiral Transformation of the Functional Integral 


A third way of understanding the Adler-Bell-Jackiw anomaly comes from
analyzing the conservation law for the axial vector current from the functional
integral for the fermion field. In Section 9.6, we used the functional integral
to derive the current conservation equations and the Ward identities
associated with any symmetry of the Lagrangian. It is instructive to see how this
argument breaks down when we apply it to the chiral symmetry of massless
fermions.

We first review the standard derivation of the axial vector Ward identities
following the method of Section 9.6. Starting from the fermionic functional
integral

$$ Z = \int\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\!\left[i\int d^4x\;\bar\psi(i\slashed{D})\psi\right], \tag{19.61} $$

make the change of variables

$$ \begin{aligned} \psi(x) &\to \psi'(x) = \bigl(1 + i\alpha(x)\gamma^5\bigr)\psi(x), \\ \bar\psi(x) &\to \bar\psi'(x) = \bar\psi(x)\bigl(1 + i\alpha(x)\gamma^5\bigr). \end{aligned} \tag{19.62} $$

Since the global chiral rotation, with constant $\alpha$, is a symmetry of the
Lagrangian, the only new terms in the Lagrangian that result from (19.62)
contain derivatives of $\alpha$. Thus,

$$ \begin{aligned} \int d^4x\;\bar\psi'(i\slashed{D})\psi' &= \int d^4x\,\bigl[\bar\psi(i\slashed{D})\psi - \partial_\mu\alpha(x)\,\bar\psi\gamma^\mu\gamma^5\psi\bigr] \\ &= \int d^4x\,\bigl[\bar\psi(i\slashed{D})\psi + \alpha(x)\,\partial_\mu(\bar\psi\gamma^\mu\gamma^5\psi)\bigr]. \end{aligned} \tag{19.63} $$

Then, by varying the Lagrangian with respect to $\alpha(x)$, we derive the classical
conservation equation for the axial current. By carrying out a similar
manipulation on the functional expression for a correlation function, as in Eq. (9.102),
we would derive the associated Ward identities.

In the argument just given, we assumed that the functional measure does
not change when we change variables from $\psi'(x)$ to $\psi(x)$. This seems reasonable,
because the relation of $\psi'$ and $\psi$ in (19.62) looks like a unitary transformation.
However, we should examine this point more closely.* First, we must carefully
define the functional measure. To do this, expand the fermion field in a basis
of eigenstates of $i\slashed{D}$. Define right and left eigenvectors of $i\slashed{D}$ by

$$ (i\slashed{D})\phi_m = \lambda_m\phi_m, \qquad \hat\phi_m(i\slashed{D}) \equiv -iD_\mu\hat\phi_m\gamma^\mu = \lambda_m\hat\phi_m. \tag{19.64} $$

*K. Fujikawa, Phys. Rev. Lett. 42, 1195 (1979); Phys. Rev. D21, 2848 (1980).

For zero background field, these eigenstates are Dirac wavefunctions of
definite momentum; the eigenvalues satisfy

$$ \lambda_m^2 = k^2 = (k^0)^2 - (\mathbf{k})^2. \tag{19.65} $$

For a fixed background field, this is also the asymptotic form of the eigenvalues for
large $k$. These eigenfunctions give us a basis that we can use to expand $\psi$ and
$\bar\psi$:

$$ \psi(x) = \sum_m a_m\phi_m(x), \qquad \bar\psi(x) = \sum_m \hat a_m\hat\phi_m(x), \tag{19.66} $$

where $a_m$, $\hat a_m$ are anticommuting coefficients multiplying the c-number
eigenfunctions (19.64). The functional measure over $\psi$, $\bar\psi$ can then be defined as

$$ \mathcal{D}\psi\,\mathcal{D}\bar\psi = \prod_m da_m\,d\hat a_m, \tag{19.67} $$

and the functional measure over $\psi'$, $\bar\psi'$ can be defined in the same way.

If $\psi'(x) = (1 + i\alpha(x)\gamma^5)\psi(x)$, the expansion coefficients of $\psi$ and $\psi'$ are
related by an infinitesimal linear transformation $(1 + C)$, computed as follows:

$$ a'_m = \sum_n\int d^4x\;\phi_m^\dagger(x)\bigl(1 + i\alpha(x)\gamma^5\bigr)\phi_n(x)\,a_n = \sum_n(\delta_{mn} + C_{mn})\,a_n. \tag{19.68} $$

Then

$$ \mathcal{D}\psi'\,\mathcal{D}\bar\psi' = \mathcal{J}^{-2}\cdot\mathcal{D}\psi\,\mathcal{D}\bar\psi, \tag{19.69} $$

where $\mathcal{J}$ is the Jacobian determinant of the transformation $(1 + C)$. The
inverse of $\mathcal{J}$ appears in (19.69) as a result of the rule (9.63) or (9.69) for
fermionic integration. To evaluate $\mathcal{J}$, we write

$$ \mathcal{J} = \det(1 + C) = \exp\bigl[\mathrm{tr}\log(1 + C)\bigr] = \exp\Bigl[\sum_n C_{nn} + \cdots\Bigr], \tag{19.70} $$

and we can ignore higher order terms in the last line because $C$ is infinitesimal.
Thus,

$$ \log\mathcal{J} = i\int d^4x\;\alpha(x)\sum_n\phi_n^\dagger(x)\gamma^5\phi_n(x). \tag{19.71} $$

The coefficient of $\alpha(x)$ looks like $\mathrm{tr}[\gamma^5] = 0$. However, we must regularize the
sum over eigenstates $n$ in a gauge-invariant way. The natural choice is

$$ \sum_n\phi_n^\dagger(x)\gamma^5\phi_n(x) = \lim_{M\to\infty}\sum_n\phi_n^\dagger(x)\gamma^5\phi_n(x)\,e^{\lambda_n^2/M^2}. \tag{19.72} $$

As (19.65) indicates, the sign of $\lambda_n^2$ will be negative at large momentum after
a Wick rotation; thus, the sign in the exponent of the convergence factor is
given correctly. We can write (19.72) in an operator form

$$ \begin{aligned} \sum_n\phi_n^\dagger(x)\gamma^5\phi_n(x) &= \lim_{M\to\infty}\sum_n\phi_n^\dagger(x)\,\gamma^5\,e^{(i\slashed{D})^2/M^2}\phi_n(x) \\ &= \lim_{M\to\infty}\langle x|\,\mathrm{tr}\bigl[\gamma^5 e^{(i\slashed{D})^2/M^2}\bigr]\,|x\rangle, \end{aligned} \tag{19.73} $$

where, in the second line, we trace over Dirac indices.

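To evaluate (19.73), we rewrite $(i\slashed{D})^2$ according to (16.107). In our present
conventions, this equation reads

$$ (i\slashed{D})^2 = -D^2 + \frac{e}{2}\,\sigma^{\mu\nu}F_{\mu\nu}, \tag{19.74} $$

with $\sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. Since we are taking the limit $M \to \infty$, we can
concentrate our attention on the asymptotic part of the spectrum, where the
momentum $k$ is large and we can expand in powers of the gauge field. To
obtain a nonzero trace with $\gamma^5$, we must bring down four Dirac matrices from
the exponent. The leading term is given by expanding the exponent to order
$(\sigma\cdot F)^2$, and then ignoring the background $A_\mu$ field in all other terms. This
gives

$$ \begin{aligned} \lim_{M\to\infty}&\langle x|\,\mathrm{tr}\bigl[\gamma^5 e^{(-D^2 + \frac{e}{2}\sigma\cdot F)/M^2}\bigr]\,|x\rangle \\ &= \lim_{M\to\infty}\mathrm{tr}\!\left[\gamma^5\,\frac{1}{2}\Bigl(\frac{e}{2M^2}\,\sigma^{\mu\nu}F_{\mu\nu}(x)\Bigr)^{2}\right]\langle x|\,e^{-\partial^2/M^2}\,|x\rangle. \end{aligned} \tag{19.75} $$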


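The matrix element in (19.75) can be evaluated by a Wick rotation:

$$ \begin{aligned} \langle x|\,e^{-\partial^2/M^2}\,|x\rangle &= \lim_{x\to y}\int\frac{d^4k}{(2\pi)^4}\;e^{-ik\cdot(x-y)}\,e^{k^2/M^2} \\ &= i\int\frac{d^4k_E}{(2\pi)^4}\;e^{-k_E^2/M^2} = i\,\frac{M^4}{16\pi^2}. \end{aligned} \tag{19.76} $$

Then (19.75) reduces to

$$ \lim_{M\to\infty}\frac{-ie^2}{8\cdot 16\pi^2}\,\mathrm{tr}\bigl[\gamma^5\gamma^\mu\gamma^\nu\gamma^\lambda\gamma^\sigma\bigr]F_{\mu\nu}F_{\lambda\sigma}(x) = -\frac{e^2}{32\pi^2}\,\epsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma}(x). \tag{19.77} $$

Thus,

$$ \mathcal{J} = \exp\!\left[-i\int d^4x\;\alpha(x)\Bigl(\frac{e^2}{32\pi^2}\,\epsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma}(x)\Bigr)\right]. \tag{19.78} $$

In all, we find that, after the change of variables (19.62), the functional integral
(19.61) takes the form

$$ Z = \int\mathcal{D}\psi\,\mathcal{D}\bar\psi\;\exp\!\left[i\int d^4x\,\Bigl(\bar\psi(i\slashed{D})\psi + \alpha(x)\Bigl\{\partial_\mu j^{\mu 5} + \frac{e^2}{16\pi^2}\,\epsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma}\Bigr\}\Bigr)\right]. \tag{19.79} $$

Varying the exponent with respect to $\alpha(x)$, we find precisely the Adler-Bell-
Jackiw anomaly equation.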
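
The Gaussian momentum integral in (19.76) fixes the numerical coefficient of the anomaly, so it is worth a quick check. The sketch below evaluates the Euclidean integral by a midpoint rule in the radial variable, assuming the four-dimensional solid angle $2\pi^2$; the function name, step count, and cutoff are illustrative choices, not anything specified in the text.

```python
import math


def euclidean_gaussian_integral(M, steps=20000, k_max_over_M=12.0):
    """Midpoint-rule estimate of int d^4k_E/(2 pi)^4 exp(-k_E^2/M^2),
    written as a radial integral with the 4D solid angle 2*pi^2."""
    dk = k_max_over_M * M / steps
    total = 0.0
    for j in range(steps):
        k = (j + 0.5) * dk
        total += k ** 3 * math.exp(-(k / M) ** 2) * dk
    return 2 * math.pi ** 2 * total / (2 * math.pi) ** 4


M = 3.0
numeric = euclidean_gaussian_integral(M)
exact = M ** 4 / (16 * math.pi ** 2)  # value quoted in (19.76), without the i
print(numeric, exact)
```

The factor $M^4$ is what cancels the $1/M^4$ from the two powers of $\sigma\cdot F/M^2$ in (19.75), leaving a finite result as $M \to \infty$.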

This derivation of the axial vector anomaly is especially interesting
because it generalizes readily to any even dimensionality. The functional
derivation always picks out for the right-hand side of the anomaly equation the
pseudoscalar operator built from gauge fields that has the same dimension, $d$,
as the divergence of the current. In two dimensions, this derivation leads
immediately to (19.18). As long as $d$ is even, we can always construct a matrix
$\gamma^5$ that anticommutes with all of the Dirac matrices by taking their product.
Then, the functional derivation leads straightforwardly to the result

$$ \partial_\mu j^{\mu 5} = (-1)^{n+1}\,\frac{2e^n}{n!\,(4\pi)^n}\;\epsilon^{\mu_1\mu_2\cdots\mu_{2n}}F_{\mu_1\mu_2}\cdots F_{\mu_{2n-1}\mu_{2n}}, \tag{19.80} $$

where $n = d/2$.
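
As a consistency check on (19.80), the general coefficient can be compared with the two cases worked out explicitly in this chapter. This is a small sketch; the function name is hypothetical and the coefficient is quoted in units of $e^n$.

```python
import math


def anomaly_coefficient(n):
    """Numerical coefficient of e^n * epsilon * F...F in (19.80), d = 2n."""
    return (-1) ** (n + 1) * 2 / (math.factorial(n) * (4 * math.pi) ** n)


# d = 2: compare with the coefficient e/(2 pi) appearing in (19.39).
print(anomaly_coefficient(1), 1 / (2 * math.pi))
# d = 4: compare with the coefficient -e^2/(16 pi^2) in (19.45).
print(anomaly_coefficient(2), -1 / (16 * math.pi ** 2))
```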

At the end of the previous section, we argued that the axial vector
anomaly leads to global nonconservation of fermionic charges in a
two-dimensional system with a macroscopic electric field. In the same way, the
four-dimensional anomaly equation leads to global nonconservation of the
number of left- and right-handed fermions in background fields in which the
right-hand side of (19.45) is nonzero. These are field configurations with
parallel electric and magnetic fields. In Problem 19.1, we work out an example
of four-dimensional massless fermions in a simple situation of this type and
show that the fermion numbers are indeed violated, in a manner similar to
what we saw at the end of Section 19.1, in accord with the Adler-Bell-Jackiw
anomaly.


19.3 Goldstone Bosons and Chiral Symmetries in QCD 

The Adler-Bell-Jackiw anomaly has a number of important implications for 
QCD. To describe these, we must first discuss the chiral symmetries of QCD 
systematically. In this discussion, we will ignore all but the lightest quarks u 
and d. In many analyses of the low-energy structure of the strong interactions, 
one also treats the s quark as light; this gives results that naturally generalize 
the ones we will find below. 

The fermionic part of the QCD Lagrangian is

$$ \mathcal{L} = \bar u\,i\slashed{D}\,u + \bar d\,i\slashed{D}\,d - m_u\bar u u - m_d\bar d d. \tag{19.81} $$

If the $u$ and $d$ quarks are very light, the last two terms are small and can
be neglected. Let us study the implications of making this approximation. If
we ignore the $u$ and $d$ masses, the Lagrangian (19.81) of course has isospin
symmetry, the symmetry of an $SU(2)$ unitary transformation mixing the $u$
and $d$ fields. However, because the classical Lagrangian for massless fermions
contains no coupling between left- and right-handed quarks, this Lagrangian
actually is symmetric under the separate unitary transformations

$$ \begin{pmatrix} u \\ d \end{pmatrix}_{\!L} \to U_L\begin{pmatrix} u \\ d \end{pmatrix}_{\!L}, \qquad \begin{pmatrix} u \\ d \end{pmatrix}_{\!R} \to U_R\begin{pmatrix} u \\ d \end{pmatrix}_{\!R}. \tag{19.82} $$

It is useful to separate the $U(1)$ and $SU(2)$ parts of these transformations; then
the symmetry group of the classical, massless QCD Lagrangian is $SU(2) \times
SU(2) \times U(1) \times U(1)$. Let $Q$ denote the quark doublet, with chiral components

$$ Q_L = \Bigl(\frac{1-\gamma^5}{2}\Bigr)\begin{pmatrix} u \\ d \end{pmatrix}, \qquad Q_R = \Bigl(\frac{1+\gamma^5}{2}\Bigr)\begin{pmatrix} u \\ d \end{pmatrix}. \tag{19.83} $$

Then we can write the currents associated with these symmetries as

$$ \begin{aligned} j_L^\mu &= \bar Q_L\gamma^\mu Q_L, & j_R^\mu &= \bar Q_R\gamma^\mu Q_R, \\ j_L^{\mu a} &= \bar Q_L\gamma^\mu\tau^a Q_L, & j_R^{\mu a} &= \bar Q_R\gamma^\mu\tau^a Q_R, \end{aligned} \tag{19.84} $$

where $\tau^a = \sigma^a/2$ represent the generators of $SU(2)$. The sums of left- and
right-handed currents give the baryon number and isospin currents

$$ j^\mu = \bar Q\gamma^\mu Q, \qquad j^{\mu a} = \bar Q\gamma^\mu\tau^a Q. \tag{19.85} $$

The corresponding symmetries are the transformations (19.82) with $U_L = U_R$.
The differences of the currents (19.84) give the corresponding axial vector
currents $j^{\mu 5}$, $j^{\mu 5a}$:

$$ j^{\mu 5} = \bar Q\gamma^\mu\gamma^5 Q, \qquad j^{\mu 5a} = \bar Q\gamma^\mu\gamma^5\tau^a Q. \tag{19.86} $$
In the discussion to follow, we will derive conclusions about the strong
interactions by assuming that the classical conservation laws for these currents are
not spoiled by anomalies. We will show below that this assumption is correct
for the isotriplet currents $j^{\mu 5a}$ but not for $j^{\mu 5}$.

The vector $SU(2) \times U(1)$ transformations are manifest symmetries of the
strong interactions, and the associated currents lead to familiar conservation
laws. What about the orthogonal, axial vector, transformations? These do
not correspond to any obvious symmetry of the strong interactions. In 1960,
Nambu and Jona-Lasinio hypothesized that these are accurate symmetries of
the strong interactions that are spontaneously broken.† This idea has led to
a correct and surprisingly detailed description of the properties of the strong
interactions at low energy.

†Y. Nambu and G. Jona-Lasinio, Phys. Rev. 122, 345 (1961).

Figure 19.5. A quark-antiquark pair with zero total momentum and angular
momentum.

Spontaneous Breaking of Chiral Symmetry 

Before we describe the consequences of spontaneously broken chiral symmetry,
let us ask why we might expect the chiral symmetries to be spontaneously
broken in the first place. In the theory of superconductivity, a small
electron-electron attraction leads to the appearance of a condensate of electron pairs
in the ground state of a metal. In QCD, quarks and antiquarks have strong
attractive interactions, and, if these quarks are massless, the energy cost of
creating an extra quark-antiquark pair is small. Thus we expect that the
vacuum of QCD will contain a condensate of quark-antiquark pairs. These
fermion pairs must have zero total momentum and angular momentum. Thus,
as Fig. 19.5 shows, they must contain net chiral charge, pairing left-handed
quarks with the antiparticles of right-handed quarks. The vacuum state with
a quark pair condensate is characterized by a nonzero vacuum expectation
value for the scalar operator

$$ \langle 0|\,\bar Q Q\,|0\rangle = \langle 0|\,\bar Q_L Q_R + \bar Q_R Q_L\,|0\rangle \neq 0, \tag{19.87} $$

which transforms under (19.82) with $U_L \neq U_R$. The expectation value signals
that the vacuum mixes the two quark helicities. This allows the $u$ and $d$ quarks
to acquire effective masses as they move through the vacuum. Inside
quark-antiquark bound states, the $u$ and $d$ quarks would appear to move as if they
had a sizable effective mass, even if they had zero mass in the original QCD
Lagrangian.

The vacuum expectation value (19.87) signals the spontaneous breaking
of the full symmetry group (19.82) down to the subgroup of vector symmetries
with $U_L = U_R$. Thus there are four spontaneously broken continuous
symmetries, associated with the four axial vector currents. At the end of Section
11.1, we proved Goldstone's theorem, which states that every spontaneously
broken continuous symmetry of a quantum field theory leads to a massless
particle with the quantum numbers of a local symmetry rotation. This means
that, in QCD with massless $u$ and $d$ quarks, we should find four spin-zero
particles with the correct quantum numbers to be created by the four axial
vector currents.

The real strong interactions do not contain any massless particles, but
they do contain an isospin triplet of relatively light mesons, the pions. These
particles are known to have odd parity (as we expect if they are
quark-antiquark bound states). Thus, they can be created by the axial isospin
currents. We can parametrize the matrix element of $j^{\mu 5a}$ between the vacuum
and an on-shell pion by writing

$$ \langle 0|\,j^{\mu 5a}(x)\,|\pi^b(p)\rangle = -ip^\mu f_\pi\,\delta^{ab}\,e^{-ip\cdot x}, \tag{19.88} $$

where $a$, $b$ are isospin indices and $f_\pi$ is a constant with the dimensions of
(mass)$^1$. We show in Problem 19.2 that the value of $f_\pi$ can be determined from
the rate of $\pi^+$ decay through the weak interaction; one finds $f_\pi = 93$ MeV.
For this reason, $f_\pi$ is often called the pion decay constant. If we contract
(19.88) with $p_\mu$ and use the conservation of the axial currents, we find that
an on-shell pion must satisfy $p^2 = 0$, that is, it must be massless, as required
by Goldstone's theorem.

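If we now restore the quark mass terms in (19.81), the axial currents are
no longer exactly conserved. The equation of motion of the quark field is now

$$ i\slashed{D}\,Q = \mathbf{m}\,Q, \qquad -iD_\mu\bar Q\gamma^\mu = \bar Q\,\mathbf{m}, \tag{19.89} $$

where

$$ \mathbf{m} = \begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix} \tag{19.90} $$

is the quark mass matrix. Then one can readily compute

$$ \partial_\mu j^{\mu 5a} = i\bar Q\,\{\mathbf{m}, \tau^a\}\,\gamma^5 Q. \tag{19.91} $$

Using this equation together with (19.88), we find

$$ \langle 0|\,\partial_\mu j^{\mu 5a}(0)\,|\pi^b(p)\rangle = -p^2 f_\pi\,\delta^{ab} = \langle 0|\,i\bar Q\,\{\mathbf{m}, \tau^a\}\,\gamma^5 Q\,|\pi^b(p)\rangle. \tag{19.92} $$

The last expression is an invariant quantity times

$$ \mathrm{tr}\bigl[\{\mathbf{m}, \tau^a\}\,\tau^b\bigr] = \tfrac{1}{2}\,\delta^{ab}\,(m_u + m_d). \tag{19.93} $$

Thus, the quark mass terms give the pions masses of the form

$$ m_\pi^2 = (m_u + m_d)\,\frac{M^2}{f_\pi}. \tag{19.94} $$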
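
The flavor trace (19.93) can be verified directly with $2 \times 2$ matrices. The sketch below is a self-contained check using plain Python lists, with $\tau^a = \sigma^a/2$ built from the standard Pauli matrices; the helper names and the sample mass values are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(a, b):
    """Sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


# Pauli matrices and the isospin generators tau^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
TAU = [[[entry / 2 for entry in row] for row in s] for s in SIGMA]


def mass_trace(m_u, m_d, a, b):
    """tr[{m, tau^a} tau^b] for the diagonal mass matrix m = diag(m_u, m_d)."""
    m = [[m_u, 0], [0, m_d]]
    anticommutator = matadd(matmul(m, TAU[a]), matmul(TAU[a], m))
    return trace(matmul(anticommutator, TAU[b]))


m_u, m_d = 2.0, 5.0  # arbitrary sample values, not physical inputs
for a in range(3):
    print([mass_trace(m_u, m_d, a, b) for b in range(3)])
```

The diagonal entries come out equal to $(m_u + m_d)/2$ and the off-diagonal ones vanish, so all three pions receive the same mass even when $m_u \neq m_d$.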


The mass parameter $M$ has been estimated to be of order 400 MeV. Thus,
to give the observed pion mass of 140 MeV, one needs only $(m_u + m_d) \sim 10$
MeV. This is a small perturbation on the strong interactions.
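
The arithmetic behind this estimate follows directly from (19.94). A minimal sketch, using only the numbers quoted in the text (the function names are hypothetical):

```python
import math


def pion_mass(quark_mass_sum, M, f_pi):
    """Pion mass from (19.94): m_pi^2 = (m_u + m_d) * M^2 / f_pi (all in MeV)."""
    return math.sqrt(quark_mass_sum * M ** 2 / f_pi)


def required_quark_mass_sum(m_pi, M, f_pi):
    """Invert (19.94) for (m_u + m_d), given the pion mass (all in MeV)."""
    return m_pi ** 2 * f_pi / M ** 2


estimate = pion_mass(10.0, 400.0, 93.0)
needed = required_quark_mass_sum(140.0, 400.0, 93.0)
print(estimate, needed)
```

With $M = 400$ MeV and $f_\pi = 93$ MeV, a quark mass sum of 10 MeV gives a pion mass of roughly 130 MeV, and reproducing 140 MeV requires a sum of about 11 MeV, consistent with the order-of-magnitude statement above.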

This argument has an interesting implication for the nature of the isospin
symmetry of the strong interactions. In the limit in which the $u$ and $d$ quarks
have zero mass in the Lagrangian, these quarks acquire large, equal effective
masses from the vacuum with spontaneously broken chiral symmetry. As long
as the masses $m_u$ and $m_d$ in the Lagrangian are small compared to the
effective mass, the $u$ and $d$ quarks will behave inside hadrons as though they are
approximately degenerate. Thus the isospin symmetry of the strong
interactions need have nothing to do with a fundamental symmetry linking $u$ and $d$;
it follows for any arbitrary relation between $m_u$ and $m_d$, provided that both
of these parameters are much less than 300 MeV. Similarly, the approximate
$SU(3)$ symmetry of the strong interactions follows if the fundamental mass of
the $s$ quark is also small compared to the strong interaction scale. The best
current estimates of the mass ratios $m_u : m_d : m_s$ are in fact 1 : 2 : 40, so
that the fundamental Lagrangian of the strong interactions shows no sign of
flavor symmetry among the quark masses.‡

Figure 19.6. Matrix element of the axial isospin current in the nucleon: (a)
kinematics of the amplitude; (b) contribution that leads to a pole in $q^2$.

The identification of the pions as Goldstone bosons of broken chiral
symmetry has a number of implications for hadronic matrix elements. Here we
will give only one example. In the following argument, we will work in the
limit of exact chiral symmetry, ignoring the small corrections from the $u$ and
$d$ masses.

The matrix element of the axial isospin current in the nucleon, a quantity
that enters the theory of neutron and nuclear $\beta$ decay, can be written in terms
of form factors as follows:

$$ \langle N|\,j^{\mu 5a}(q)\,|N\rangle = \bar u\Bigl[\gamma^\mu\gamma^5 F_1^5(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\,\gamma^5 F_2^5(q^2) + q^\mu\gamma^5 F_3^5(q^2)\Bigr]\tau^a u. \tag{19.95} $$

The kinematics of the vertex is shown in Fig. 19.6. Notice that there is one
more possible form factor than in the vector case, Eq. (6.33). The value of $F_1^5$
at $q^2 = 0$ is not restricted by the value of any manifestly conserved charge.
Conventionally, one writes simply

$$ F_1^5(0) = g_A. \tag{19.96} $$

However, we will now show that the value of this quantity can be computed.

If we ignore quark masses, the axial vector current in (19.95) is conserved,
so the form factors satisfy

$$ \begin{aligned} 0 &= \bar u(p')\bigl[\slashed{q}\gamma^5 F_1^5(q^2) + q^2\gamma^5 F_3^5(q^2)\bigr]u(p) \\ &= \bar u(p')\bigl[(\slashed{p}' - \slashed{p})\gamma^5 F_1^5(q^2) + q^2\gamma^5 F_3^5(q^2)\bigr]u(p) \\ &= \bar u(p')\bigl[2m_N\gamma^5 F_1^5(q^2) + q^2\gamma^5 F_3^5(q^2)\bigr]u(p). \end{aligned} \tag{19.97} $$

‡The determination of the fundamental quark masses is reviewed by J. Gasser
and H. Leutwyler, Phys. Repts. 87, 77 (1982).

Thus, we find

$$ g_A = \lim_{q^2\to 0}\frac{q^2}{2m_N}\,F_3^5(q^2). \tag{19.98} $$

This equation implies that $g_A = 0$ unless $F_3^5$ contains a pole in $q^2$. Such a
pole would imply the presence of a physical massless particle, but fortunately,
there is one available: the massless pion. The process in which the current
creates a pion that is then absorbed by the nucleon indeed leads to a pole in
$F_3^5(q^2)$, as shown in Fig. 19.6(b).

Let us now compute this pole term and use it to determine $g_A$. The
low-energy pion-nucleon interaction is conventionally parametrized by the
Lagrangian

$$ \Delta\mathcal{L} = ig_{\pi NN}\,\pi^a\,\bar N\gamma^5\sigma^a N. \tag{19.99} $$

The amplitude for the current $j^{\mu 5a}$ to create the pion is given by (19.88).
Then the contribution of Fig. 19.6(b) to the current vertex is

$$ -g_{\pi NN}\,\bar u(2\tau^a\gamma^5)u\cdot\frac{i}{q^2}\cdot(iq^\mu f_\pi). \tag{19.100} $$

Thus,

$$ F_3^5(q^2) = \frac{1}{q^2}\cdot 2f_\pi g_{\pi NN}. \tag{19.101} $$

We find that $g_A$ is given by a combination of $f_\pi$, the nucleon mass, and the
pion-nucleon coupling constant:

$$ g_A = \frac{f_\pi}{m_N}\,g_{\pi NN}. \tag{19.102} $$

This strange identity, called the Goldberger-Treiman relation, is satisfied
experimentally to 5% accuracy.
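
A rough numerical check of (19.102) is sketched below. Only $f_\pi = 93$ MeV comes from the text; the nucleon mass, the pion-nucleon coupling, and the measured axial coupling are assumed illustrative inputs (commonly quoted approximate values) and should be replaced by whatever data one trusts.

```python
def goldberger_treiman_g_a(f_pi_mev, m_nucleon_mev, g_pi_nn):
    """Right-hand side of (19.102): g_A = f_pi * g_piNN / m_N."""
    return f_pi_mev * g_pi_nn / m_nucleon_mev


f_pi = 93.0          # MeV, quoted in the text
m_nucleon = 939.0    # MeV, assumed input (not given in the text)
g_pi_nn = 13.5       # assumed illustrative pion-nucleon coupling
g_a_measured = 1.27  # assumed illustrative measured value of g_A

predicted = goldberger_treiman_g_a(f_pi, m_nucleon, g_pi_nn)
relative_gap = abs(predicted - g_a_measured) / g_a_measured
print(predicted, relative_gap)
```

With these assumed inputs the prediction differs from the measured value at the level of a few percent, in line with the accuracy quoted above.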

The identification of the pion as the Goldstone boson of spontaneously 
broken chiral symmetry leads to numerous other predictions for current matrix 
elements and pion scattering amplitudes. In particular, the leading terms of 
the pion-pion and pion-nucleon scattering amplitudes at low energy can be 
computed directly in terms of f n by arguments similar to one just given.* 

Anomalies of Chiral Currents 

Up to this point, we have discussed the chiral symmetries of QCD according 
to the classical current conservation equations. We must now ask whether 
these equations are affected by the Adler-Bell-Jackiw anomaly, and what the 
consequences of that modification are. 

To begin, we study the modification of the chiral conservation laws due 
to the coupling of the quark currents to the gluon fields of QCD. The arguments
given in the previous section go through equally well in the case of


*The detailed consequences of spontaneously broken chiral symmetry are worked 
out in a very clear manner in Georgi (1984). 
Figure 19.7. Diagrams that lead to an axial vector anomaly for a chiral
current in QCD. 

massless fermions coupling to a non-Abelian gauge field, so we expect that an 
axial vector current will receive an anomalous contribution from the diagrams 
shown in Fig. 19.7. The anomaly equation should be the Abelian result, supplemented
by an appropriate group theory factor. In addition, since the axial
current is gauge invariant, the anomaly must also be gauge invariant. That 
is, it must contain the full non-Abelian field strength, including its nonlinear 
terms. These terms are actually included in the functional derivation of the 
anomaly given at the end of Section 19.2. 

For the axial currents of QCD, written in (19.86), we can read the group 
theory factors for the Adler-Bell-Jackiw anomaly from the diagrams of Fig. 
19.7. For the axial isospin currents, 

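$$ \partial_\mu j^{\mu 5a} = -\frac{g^2}{16\pi^2}\,\epsilon^{\alpha\beta\gamma\delta}F^c_{\alpha\beta}F^d_{\gamma\delta}\cdot\mathrm{tr}[\tau^a t^c t^d], \tag{19.103} $$

where $F^c_{\alpha\beta}$ is a gluon field strength, $\tau^a$ is an isospin matrix, $t^c$ is a color matrix,
and the trace is taken over colors and flavors. In this case, we find

$$ \mathrm{tr}[\tau^a t^c t^d] = \mathrm{tr}[\tau^a]\,\mathrm{tr}[t^c t^d] = 0, \tag{19.104} $$

since the trace of a single $\tau^a$ vanishes. Thus the conservation of the axial
isospin currents is unaffected by the Adler-Bell-Jackiw anomaly of QCD. However,
in the case of the isospin singlet axial current, the matrix $\tau^a$ is replaced
by the matrix 1 on flavors, and we find

$$ \partial_\mu j^{\mu 5} = -\frac{g^2 n_f}{32\pi^2}\,\epsilon^{\alpha\beta\gamma\delta}F^c_{\alpha\beta}F^c_{\gamma\delta}, \tag{19.105} $$

where $n_f$ is the number of flavors; $n_f = 2$ in our current model.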
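
The group theory factors quoted here can be checked with a few lines of Python. The sketch below is only an illustration: it assumes the color normalization $\mathrm{tr}[t^c t^d] = \frac12\delta^{cd}$, represents the isospin matrices as $\tau^a = \sigma^a/2$, and uses variable names of our own choosing.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Isospin matrices tau^a = sigma^a / 2 acting on the (u, d) flavor doublet.
TAU = [
    [[0, HALF], [HALF, 0]],
    [[0, -HALF * 1j], [HALF * 1j, 0]],
    [[HALF, 0], [0, -HALF]],
]
IDENTITY = [[1, 0], [0, 1]]
N_F = 2                             # number of flavors in the two-flavor model
COLOR_TRACE_NORM = Fraction(1, 2)   # assumed: tr[t^c t^d] = (1/2) delta^{cd}


def trace(matrix):
    """Sum of the diagonal entries of a square nested-list matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# The trace over colors and flavors factorizes into flavor trace times color trace.
isotriplet_factors = [trace(tau) * COLOR_TRACE_NORM for tau in TAU]
isosinglet_factor = trace(IDENTITY) * COLOR_TRACE_NORM

# Coefficient multiplying g^2/pi^2 for the singlet current: (1/16) * (n_f / 2).
singlet_coefficient = Fraction(1, 16) * isosinglet_factor

print("isotriplet factors:", isotriplet_factors)
print("isosinglet factor: ", isosinglet_factor)
print("singlet coefficient:", singlet_coefficient)
```

The isotriplet factors vanish because each $\tau^a$ is traceless, while the singlet factor $n_f/2$ turns the $1/16\pi^2$ of the Abelian anomaly into the $n_f/32\pi^2$ of the singlet current.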

Thus, the isospin singlet axial current is not in fact conserved in QCD. The 
divergence of this current is equal to a gluon operator with nontrivial matrix 
elements between hadron states. Some subtle questions remain concerning the 
effects of this operator. In particular, it can be shown, as we saw for the two- 
dimensional axial anomaly in Eq. (19.31), that the right-hand side of (19.105) 
is a total divergence. Nevertheless, again in accord with our experience in 
two dimensions, there are physically reasonable field configurations in which 
the four-dimensional integral of this term takes a nonzero value. This topic 
is discussed further at the end of Section 22.3. In any event, Eq. (19.105) 
indeed implies that QCD has no isosinglet axial symmetry and no associated 
Goldstone boson. This equation explains why the strong interactions contain 
no light isosinglet pseudoscalar meson with mass comparable to that of the
pions. 

Though the axial isospin currents have no axial anomaly from QCD interactions,
they do have an anomaly associated with the coupling of quarks
to electromagnetism. Again referring to the diagrams of Fig. 19.7, we see that
the electromagnetic anomaly of the axial isospin currents is given by

$$ \partial_\mu j^{\mu 5a} = -\frac{e^2}{16\pi^2}\,\epsilon^{\alpha\beta\gamma\delta}F_{\alpha\beta}F_{\gamma\delta}\cdot\mathrm{tr}[\tau^a Q^2], \tag{19.106} $$

where $F_{\alpha\beta}$ is the electromagnetic field strength, $Q$ is the matrix of quark
electric charges,

$$ Q = \begin{pmatrix} \tfrac23 & 0 \\ 0 & -\tfrac13 \end{pmatrix}, \tag{19.107} $$

and the trace again runs over flavors and colors. Since the matrices in the trace
do not depend on color, the color sum simply gives a factor of 3. The flavor
trace is nonzero only for $a = 3$; in that case, the electromagnetic anomaly is

$$ \partial_\mu j^{\mu 53} = -\frac{e^2}{32\pi^2}\,\epsilon^{\alpha\beta\gamma\delta}F_{\alpha\beta}F_{\gamma\delta}. \tag{19.108} $$
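
The trace factor for $a = 3$ is simple enough to evaluate exactly. The short Python sketch below does so with rational arithmetic; the variable names are ours, and $\tau^3 = \mathrm{diag}(\frac12, -\frac12)$ is assumed for the isospin matrix.

```python
from fractions import Fraction

N_COLORS = 3
Q_DIAG = [Fraction(2, 3), Fraction(-1, 3)]      # charges of (u, d)
TAU3_DIAG = [Fraction(1, 2), Fraction(-1, 2)]   # tau^3 = sigma^3 / 2 (assumed)

# tau^3 and Q are both diagonal, so the flavor trace is a sum over entries.
flavor_trace = sum(t * q * q for t, q in zip(TAU3_DIAG, Q_DIAG))
trace_factor = N_COLORS * flavor_trace

# Multiply the 1/16 of the general formula by the trace factor.
anomaly_coefficient = Fraction(1, 16) * trace_factor

print("tr[tau^3 Q^2] over flavors:", flavor_trace)
print("including colors:          ", trace_factor)
print("coefficient of e^2/pi^2:   ", anomaly_coefficient)
```

The flavor trace is $\frac12(\frac49 - \frac19) = \frac16$, the color sum raises it to $\frac12$, and the prefactor $1/16\pi^2$ becomes $1/32\pi^2$. For $a = 1, 2$ the matrices $\tau^a$ are off-diagonal while $Q^2$ is diagonal, so those traces vanish.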

Because the current $j^{\mu 53}$ annihilates a $\pi^0$ meson, Eq. (19.108) indicates
that the axial vector anomaly contributes to the matrix element for the decay
$\pi^0 \to 2\gamma$. We will now show that, in fact, it gives the leading contribution to
this amplitude. Again, we work in the limit of massless $u$ and $d$ quarks, so
that the chiral symmetries are exact up to the effects of the anomaly. 

Consider the matrix element of the axial current between the vacuum and 
a two-photon state: 

$$ \langle p,k|\,j^{\mu 53}(q)\,|0\rangle = \epsilon^*_\nu\epsilon^*_\lambda\,\mathcal{M}^{\mu\nu\lambda}(p,k). \tag{19.109} $$


This is the same matrix element (19.46) that we studied in QED perturbation 
theory in Section 19.2. Now, however, we will study the general properties of 
this matrix element by expanding it in form factors. In general, the amplitude 
can be decomposed by writing all possible tensor structures and applying 
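the restrictions that follow from symmetry under the interchange of $(p,\nu)$
and $(k,\lambda)$ and the QED Ward identities (19.51). This leaves three possible
structures:

$$ \begin{aligned} \mathcal{M}^{\mu\nu\lambda} ={}& q^\mu\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta\,M_1 + (\epsilon^{\mu\nu\alpha\beta}k^\lambda - \epsilon^{\mu\lambda\alpha\beta}p^\nu)\,k_\alpha p_\beta\,M_2 \\ &+ \big[(\epsilon^{\mu\nu\alpha\beta}p^\lambda - \epsilon^{\mu\lambda\alpha\beta}k^\nu)\,k_\alpha p_\beta - \epsilon^{\mu\nu\lambda\alpha}(p-k)_\alpha\,p\cdot k\big]M_3. \end{aligned} \tag{19.110} $$

The second term satisfies (19.51) by virtue of the on-shell conditions
$p^2 = k^2 = 0$.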

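Now contract (19.110) with $(iq_\mu)$ to take the divergence of the axial vector
current. We find

$$ iq_\mu\mathcal{M}^{\mu\nu\lambda} = iq^2\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta\,M_1 - i\epsilon^{\mu\nu\lambda\alpha}q_\mu(p-k)_\alpha\,p\cdot k\,M_3; \tag{19.111} $$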
Figure 19.8. Contribution that leads to a pole in the axial vector current
form factor $M_1$.

the other terms automatically give zero. Using $q = p + k$, $q^2 = 2p\cdot k$, we can
simplify this to

$$ iq_\mu\mathcal{M}^{\mu\nu\lambda} = iq^2\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta\,(M_1 + M_3). \tag{19.112} $$

The whole quantity is proportional to $q^2$ and apparently vanishes in the limit
$q^2 \to 0$. This contrasts with the prediction of the axial vector anomaly. Taking
the matrix element of the right-hand side of (19.108), we find

$$ iq_\mu\mathcal{M}^{\mu\nu\lambda} = -\frac{e^2}{4\pi^2}\,\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta. \tag{19.113} $$

The conflict can be resolved if one of the form factors appearing in (19.112)
contains a pole in $q^2$. Such a pole can arise through the process shown in
Fig. 19.8, in which the current creates a $\pi^0$ meson which subsequently decays
to two photons. The amplitude for the current to create the meson is given
by (19.88). Let us parametrize the pion decay amplitude as

$$ i\mathcal{M}(\pi^0\to 2\gamma) = iA\,\epsilon^*_\nu\epsilon^*_\lambda\,\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta, \tag{19.114} $$

where $A$ is a constant to be determined. Then the contribution of the process
of Fig. 19.8 to the amplitude $\mathcal{M}^{\mu\nu\lambda}$ defined in (19.109) is

$$ (iq^\mu f_\pi)\,\frac{i}{q^2}\,\big(iA\,\epsilon^{\nu\lambda\alpha\beta}p_\alpha k_\beta\big). \tag{19.115} $$

This is a contribution to the form factor $M_1$,

$$ M_1 = \frac{-i}{q^2}\,f_\pi\cdot A, \tag{19.116} $$

plus terms regular at $q^2 = 0$. Now, by equating (19.112) to (19.113), we
determine $A$ in terms of the coefficient of the anomaly:

$$ A = -\frac{e^2}{4\pi^2}\,\frac{1}{f_\pi}. \tag{19.117} $$


From the decay matrix element (19.114), it is straightforward to work
out the decay rate of the $\pi^0$. Note that, though we have worked out the decay
matrix element in the limit of a massless $\pi^0$, we must supply the physically
correct kinematics, which depends on the $\pi^0$ mass. Including a factor 1/2 for
the phase space of identical particles, we find


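$$ \begin{aligned} \Gamma(\pi^0\to 2\gamma) &= \frac{1}{2m_\pi}\,\frac{1}{8\pi}\,\frac12\sum_{\text{pols.}}\big|\mathcal{M}(\pi^0\to 2\gamma)\big|^2 \\ &= \frac{1}{32\pi m_\pi}\cdot A^2\cdot 2(p\cdot k)^2 = A^2\,\frac{m_\pi^3}{64\pi}. \end{aligned} \tag{19.118} $$

Thus, finally,

$$ \Gamma(\pi^0\to 2\gamma) = \frac{\alpha^2}{64\pi^3}\,\frac{m_\pi^3}{f_\pi^2}. \tag{19.119} $$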


This relation, which provides a direct measure of the coefficient of the
Adler-Bell-Jackiw anomaly, is satisfied experimentally to an accuracy of a few
percent.
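
To see what (19.119) means in practice, one can insert numbers. The Python sketch below is only illustrative: the values of $\alpha$, $m_\pi$, $f_\pi$, and $\hbar$ are assumed, rounded inputs that do not come from the text, and the function name is hypothetical.

```python
import math

# Assumed, illustrative inputs.
ALPHA = 1 / 137.036        # fine-structure constant
M_PI0_MEV = 134.98         # neutral pion mass in MeV
F_PI_MEV = 93.0            # pion decay constant in MeV
HBAR_EV_S = 6.582e-16      # hbar in eV * s


def pi0_two_photon_width_mev(alpha, m_pi, f_pi):
    """Decay width of pi0 -> 2 gamma from (19.119), in the units of m_pi."""
    return alpha**2 * m_pi**3 / (64 * math.pi**3 * f_pi**2)


width_ev = pi0_two_photon_width_mev(ALPHA, M_PI0_MEV, F_PI_MEV) * 1e6
lifetime_s = HBAR_EV_S / width_ev

print(f"width    = {width_ev:.2f} eV")
print(f"lifetime = {lifetime_s:.2e} s")
```

With these inputs the width comes out at several eV, corresponding to a lifetime of order $10^{-16}$ s. The result scales as $1/f_\pi^2$, so a one-percent change in the assumed $f_\pi$ moves the width by about two percent.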


19.4 Chiral Anomalies and Chiral Gauge Theories 

Up to this point, we have coupled gauge fields to fermions in a parity-symmetric
manner, replacing the derivative in the Dirac equation by a covariant
derivative. This procedure couples the gauge field to the vector current of
fermions. However, this procedure gives only a subset of the possible couplings 
of fermions to gauge bosons. In this section we will construct more general, 
parity-asymmetric, couplings and discuss their interplay with the axial vector 
anomaly. 

We will focus primarily on theories of massless fermions. If the Lagrangian 
contains no fermion mass terms, it has no terms that mix the two helicity 
states of a Dirac fermion. Thus, in a theory that contains massless Dirac 
fermions $\psi_i$, we can write the kinetic energy term in the helicity basis (3.36)
as

$$ \mathcal{L} = \psi^\dagger_{Li}\,i\bar\sigma\cdot\partial\,\psi_{Li} + \psi^\dagger_{Ri}\,i\sigma\cdot\partial\,\psi_{Ri}. \tag{19.120} $$

There is no difficulty in coupling this system to a gauge field by assigning 
the left-handed fields $\psi_{Li}$ to one representation of the gauge group $G$ and
assigning the right-handed fields to a different representation. For example,
we might assign the left-handed fields to a representation $r$ of $G$ and take the
right-handed fields to be invariant under $G$. This gives

$$ \mathcal{L} = \psi^\dagger_{Li}\,i\bar\sigma\cdot D\,\psi_{Li} + \psi^\dagger_{Ri}\,i\sigma\cdot\partial\,\psi_{Ri}, \tag{19.121} $$

with $D_\mu = \partial_\mu - igA^a_\mu t^a_r$. In more conventional notation, (19.121) becomes

$$ \mathcal{L} = \bar\psi\,i\gamma^\mu\Big(\partial_\mu - igA^a_\mu t^a_r\Big(\frac{1-\gamma^5}{2}\Big)\Big)\psi. \tag{19.122} $$
It is straightforward to verify that the classical Lagrangian (19.122) is invariant
to the local gauge transformation

$$ \begin{aligned} \psi &\to \Big(1 + i\alpha^a t^a\Big(\frac{1-\gamma^5}{2}\Big)\Big)\psi, \\ A^a_\mu &\to A^a_\mu + \frac{1}{g}\,\partial_\mu\alpha^a + f^{abc}A^b_\mu\alpha^c, \end{aligned} \tag{19.123} $$


which generalizes (15.46). Since the right-handed fields are free fields, we can 
even eliminate these fields and write a gauge-invariant Lagrangian for purely 
left-handed fermions. 

The idea of gauge fields that couple only to left-handed fermions plays a 
central role in the construction of a theory of weak interactions. The coupling 
of the W boson to quarks and leptons described in (17.31) can be derived by 
assigning the left-handed components of quarks and leptons to doublets of an 
$SU(2)$ gauge symmetry

$$ Q_L = \begin{pmatrix} u \\ d \end{pmatrix}_L, \qquad L_L = \begin{pmatrix} \nu \\ e^- \end{pmatrix}_L, \tag{19.124} $$

and then identifying the W bosons as gauge fields that couple to this SU(2) 
group. In this picture, it is the restriction of the symmetry to left-handed 
fields that leads to the helicity structure of the weak interaction effective 
Lagrangian. We will discuss a complete, explicit model of weak interactions, 
incorporating this idea, in the next chapter. 

To work out the general properties of chirally coupled fermions, it is useful 
to rewrite their Lagrangian with one further transformation. Below Eq. (3.38), 
we noted that the quantity $\sigma^2\psi_R^*$ transforms under Lorentz transformations as
a left-handed field. Thus it is useful to rewrite the right-handed components
in (19.120) as new left-handed fermions, by defining

$$ \psi'_{Li} = \sigma^2\psi^*_{Ri}, \qquad \psi'^\dagger_{Li} = \psi^T_{Ri}\sigma^2. \tag{19.125} $$

This transformation relabels the right-handed fermions as antifermions and 
calls their left-handed antiparticles a new species of left-handed fermions. By 
using (3.38), we can rewrite the Lagrangian for the right-handed fermions as 

$$ \int d^4x\,\psi^\dagger_{Ri}\,i\sigma\cdot\partial\,\psi_{Ri} = \int d^4x\,\psi'^\dagger_{Li}\,i\bar\sigma\cdot\partial\,\psi'_{Li}. \tag{19.126} $$

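The minus sign from fermion interchange cancels the minus sign from integration
by parts. Notice that, if the fermions are coupled to gauge fields in
the representation $r$, this manipulation changes the covariant derivative as
follows:

$$ \begin{aligned} \psi^\dagger_R\,i\sigma\cdot(\partial - igA^a t^a_r)\,\psi_R &= \psi'^\dagger_L\,i\bar\sigma\cdot\big(\partial + igA^a (t^a_r)^T\big)\,\psi'_L \\ &= \psi'^\dagger_L\,i\bar\sigma\cdot(\partial - igA^a t^a_{\bar r})\,\psi'_L. \end{aligned} \tag{19.127} $$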

Thus the new fields $\psi'_L$ belong to the conjugate representation to $r$, for which
the representation matrices are given by (15.82). In this notation, QCD with
$n_f$ flavors of massless fermions is rewritten as an $SU(3)$ gauge theory coupled
to $n_f$ massless fermions in the $3$ and $n_f$ massless fermions in the $\bar 3$
representation of $SU(3)$. The most general gauge theory of massless fermions would
simply assign left-handed fermions to an arbitrary, reducible representation
$R$ of the gauge group $G$. We have just seen that rewriting a system of Dirac
fermions leads to $R = r\oplus\bar r$, a real representation in the sense described below
(15.82). Conversely, if $R$ is not a real representation, then the theory cannot
be rewritten in terms of Dirac fermions and is intrinsically chiral.

The rewriting (19.125) transforms the mass term of the QCD Lagrangian 
as follows: 

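$$ m\bar\psi_i\psi_i = m(\psi^\dagger_R\psi_L + \text{h.c.}) = -m(\psi'^T_L\sigma^2\psi_L + \text{h.c.}). \tag{19.128} $$

This has the form of the Majorana mass term that we encountered in Problem
3.4. The most general mass term that can be built purely from left-handed
fermion fields is

$$ \Delta\mathcal{L}_M = M_{ij}\,\psi^T_{Li}\sigma^2\psi_{Lj} + \text{h.c.} \tag{19.129} $$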

The matrix $M_{ij}$ is symmetric under $i\leftrightarrow j$, since the minus sign from the
antisymmetry of $\sigma^2$ is compensated by a minus sign from fermion interchange.
This mass term is gauge invariant if $M_{ij}$ is invariant under $G$. For example,
the mass term in (19.128) couples $3$ and $\bar 3$ indices together in an $SU(3)$ singlet
combination. In general, a gauge-invariant mass term exists if the representation
containing the fermions is strictly real, in the sense described below
(15.82). In an intrinsically chiral theory, there is no possible gauge-invariant 
mass term. We will see in the next chapter that, in the gauge theory of the 
weak interactions, mass terms for the quarks and leptons are forbidden by 
gauge invariance. We will present a solution to this problem in Section 20.2. 

At the classical level, there is no restriction on the representation R of the 
left-handed fermions. However, at the level of one-loop corrections, many possible
choices become inconsistent due to the axial vector anomaly. In a gauge
theory of left-handed massless fermions, consider computing the diagrams of
Fig. 19.9, in which the external fields are non-Abelian gauge bosons and the
marked vertex represents the gauge symmetry current

$$ j^{\mu a} = \bar\psi\gamma^\mu\Big(\frac{1-\gamma^5}{2}\Big)t^a\psi. \tag{19.130} $$

The gauge boson vertices also contain factors of $(1-\gamma^5)/2$. The three projectors
can be moved together into a single factor. Then, if we regularize
this diagram as in Section 19.2, the term containing a $\gamma^5$ has an axial vector
anomaly that leads to the relation

$$ \langle p,\nu,b;\,k,\lambda,c|\,\partial_\mu j^{\mu a}\,|0\rangle = \frac{g^2}{8\pi^2}\,\epsilon^{\alpha\nu\beta\lambda}p_\alpha k_\beta\cdot\mathcal{A}^{abc}, \tag{19.131} $$

where $\mathcal{A}^{abc}$ is a trace over group matrices in the representation $R$:

$$ \mathcal{A}^{abc} = \mathrm{tr}\big[t^a\{t^b,t^c\}\big]. \tag{19.132} $$
Figure 19.9. Diagrams contributing to the anomaly of a gauge symmetry
current in a chiral gauge theory. 

This equation implies that, unless $\mathcal{A}^{abc}$ vanishes, the current $j^{\mu a}$ is not
conserved. The factor (19.132) is totally symmetric in $(a,b,c)$, so this condition
is independent of which current is treated as an external operator. As we described
in Sections 19.1 and 19.2, we can change the regularization of the
diagram so that the external current is conserved, but only at the price of 
violating the conservation of one of the other two currents in the diagram. 

Since the whole construction of a theory with local gauge invariance is 
based on the existence of an exact global symmetry, the violation of the conservation
of $j^{\mu a}$ does violence to the structure of the theory. For example,
triangle diagrams of the form of Fig. 19.9 will now generate divergent gauge
boson mass terms and will upset the delicate relations between three- and
four-point vertices discussed in Chapter 16. These relations, following from
the Ward identity, were necessary to ensure the cancellation of unphysical
states and the unitarity of the $S$-matrix. The only way to avoid this problem
is to insist that $\mathcal{A}^{abc} = 0$ as a fundamental consistency condition for
chiral gauge theories.† Gauge theories satisfying this condition are said to be
anomaly free. 

As an example of the application of this condition, consider the prototype 
weak interaction gauge theory that we presented in (19.124). If the two gauge 
bosons in Fig. 19.9 are $SU(2)$ gauge bosons and the current is an $SU(2)$
gauge current, we would evaluate (19.132) by substituting $t^a = \tau^a = \sigma^a/2$
and using the relation $\{\sigma^b,\sigma^c\} = 2\delta^{bc}$. This gives

$$ \mathcal{A}^{abc} = \frac18\,\mathrm{tr}\big[\sigma^a\cdot 2\delta^{bc}\big] = 0, \tag{19.133} $$

so the consistency condition is satisfied. If the fermions in (19.124) also couple
to electromagnetism, there is an additional consistency condition that we
would find by taking the current in Fig. 19.9 to be the electromagnetic current.
The factor $\mathcal{A}^{abc}$ for this case is

$$ \mathrm{tr}\big[Q\{\tau^b,\tau^c\}\big], \tag{19.134} $$

where $Q$ is the matrix of electric charges. If we simplify as in (19.133), the
trace (19.134) becomes

$$ \frac12\,\mathrm{tr}[Q]\,\delta^{bc}. \tag{19.135} $$


†D. J. Gross and R. Jackiw, Phys. Rev. D6, 477 (1972).
This factor is proportional to the sum of the fermion electric charges, which
does not vanish either for quarks or for leptons. However, if we sum over one
quark doublet and one lepton doublet, with a factor 3 for colors, we find

$$ \mathrm{tr}[Q] = 3\cdot\big(\tfrac23 - \tfrac13\big) + (0 - 1) = 0. \tag{19.136} $$

Remarkably, the weak interaction gauge theory described by (19.124) can be 
consistently combined with QED only if the theory contains equal numbers 
of quark and lepton doublets. 
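
The cancellation in (19.136) can be tabulated for other numbers of doublets. The following Python sketch uses exact rational arithmetic; the function `charge_trace` and its arguments are names introduced here for illustration only.

```python
from fractions import Fraction

N_COLORS = 3
QUARK_DOUBLET = [Fraction(2, 3), Fraction(-1, 3)]   # charges of (u, d)
LEPTON_DOUBLET = [Fraction(0), Fraction(-1)]        # charges of (nu, e-)


def charge_trace(n_quark_doublets, n_lepton_doublets):
    """Return tr[Q] over the left-handed doublets, counting each quark 3 times."""
    quark_part = n_quark_doublets * N_COLORS * sum(QUARK_DOUBLET)
    lepton_part = n_lepton_doublets * sum(LEPTON_DOUBLET)
    return quark_part + lepton_part


for n_q, n_l in [(1, 0), (0, 1), (1, 1), (2, 1), (3, 3)]:
    print(f"{n_q} quark, {n_l} lepton doublets: tr[Q] = {charge_trace(n_q, n_l)}")
```

Each quark doublet contributes $+1$ and each lepton doublet $-1$, so the trace vanishes exactly when the two numbers are equal.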

We complete this section by working out more generally the condition 
that a chiral gauge theory be anomaly free. We will first derive some basic 
properties of the anomaly factor $\mathcal{A}^{abc}$ and then apply these to chiral gauge
theories with simple gauge groups. 

If the fermion representation $R$ is real, $R$ is equivalent to its conjugate
representation $\bar R$. Thus, as we described below (15.82), $t^a_R$ is related by a
unitary transformation to $t^a_{\bar R} = -(t^a_R)^T$. Since (19.132) is invariant to unitary
transformations of the $t^a$, we can replace $t^a_R$ by $t^a_{\bar R}$. Then

$$ \begin{aligned} \mathcal{A}^{abc} &= \mathrm{tr}\big[(-t^a)^T\{(-t^b)^T,(-t^c)^T\}\big] \\ &= -\mathrm{tr}\big[\{t^c,t^b\}\,t^a\big] \\ &= -\mathcal{A}^{abc}. \end{aligned} \tag{19.137} $$

Thus, if R is real, the gauge theory is automatically anomaly free. As a special 
case, any gauge theory of Dirac fermions is anomaly free. 

In more general circumstances, we can simplify the calculation of $\mathcal{A}^{abc}$ by
noting that it is an invariant of the gauge group $G$ that is totally symmetric
with three indices in the adjoint representation. For some possible groups, a
suitable invariant may not exist, and in those cases $\mathcal{A}^{abc}$ must vanish. For
example, in $SU(2)$ the adjoint representation has spin 1. The symmetric product
of two spin-1 multiplets gives spin 0 plus spin 2, with no spin-1 component.
Thus, there is no symmetric tensor coupling two spin-1 indices to give a spin 1.
The factor (19.132) must then vanish in any $SU(2)$ gauge theory. We saw this
happen in an explicit example in Eq. (19.133). 
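
The vanishing of the anomaly factor for the $SU(2)$ doublet can also be confirmed by brute force over all index choices. The Python sketch below is a minimal check using nested lists for matrices; the helper functions are our own and are not part of any library.

```python
# Anomaly factor A^{abc} = tr[t^a {t^b, t^c}] for the SU(2) doublet, t^a = sigma^a / 2.

def matmul(x, y):
    """Multiply two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    """Add two square matrices entry by entry."""
    n = len(x)
    return [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def anomaly_factor(t, a, b, c):
    """Return tr[t^a {t^b, t^c}] for a list t of generator matrices."""
    anticommutator = add(matmul(t[b], t[c]), matmul(t[c], t[b]))
    return trace(matmul(t[a], anticommutator))


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T_DOUBLET = [[[0.5 * entry for entry in row] for row in sigma] for sigma in PAULI]

largest = max(abs(anomaly_factor(T_DOUBLET, a, b, c))
              for a in range(3) for b in range(3) for c in range(3))
print("largest |A^abc| for the SU(2) doublet:", largest)
```

Every one of the 27 components vanishes, as the group theory argument requires: the anticommutator is proportional to the unit matrix, and each generator is traceless.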

In $SU(n)$ groups, $n \geq 3$, there is a unique symmetric invariant $d^{abc}$ of the
required type. It appears in the anticommutator of representation matrices of
the fundamental representation:

$$ \{t^a_n, t^b_n\} = \frac1n\,\delta^{ab} + d^{abc}\,t^c_n. \tag{19.138} $$

The uniqueness of this invariant implies that, in an $SU(n)$ gauge theory, any
trace of the form of (19.132) is proportional to $d^{abc}$. For each representation
$r$, we can define an anomaly coefficient $A(r)$ by

$$ \mathrm{tr}\big[t^a_r\{t^b_r,t^c_r\}\big] = \frac12\,A(r)\,d^{abc}. \tag{19.139} $$

For the fundamental representation, we can see from (19.138) that $A(n) = 1$.
It follows from the argument of (19.137) that

$$ A(\bar r) = -A(r). \tag{19.140} $$
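
For $SU(3)$ these statements can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes the fundamental generators to be half the Gell-Mann matrices, builds the conjugate representation as $-(t^a)^T$, and compares the two sets of anomaly factors. The helper functions are our own.

```python
import math


def matmul(x, y):
    """Multiply two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    n = len(x)
    return [[x[i][j] + y[i][j] for j in range(n)] for i in range(n)]


def scale(x, s):
    return [[s * entry for entry in row] for row in x]


def transpose(x):
    return [list(column) for column in zip(*x)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def anomaly_factor(t, a, b, c):
    """Return tr[t^a {t^b, t^c}] for a list t of generator matrices."""
    anticommutator = add(matmul(t[b], t[c]), matmul(t[c], t[b]))
    return trace(matmul(t[a], anticommutator))


I = 1j
S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T_FUND = [scale(lam, 0.5) for lam in GELL_MANN]        # fundamental 3
T_CONJ = [scale(transpose(t), -1) for t in T_FUND]     # conjugate: -(t^a)^T

indices = range(8)
max_mismatch = max(
    abs(anomaly_factor(T_CONJ, a, b, c) + anomaly_factor(T_FUND, a, b, c))
    for a in indices for b in indices for c in indices
)
sample = anomaly_factor(T_FUND, 0, 0, 7)   # tr[t^1 {t^1, t^8}]

print("max |A_conj + A_fund| =", max_mismatch)
print("tr[t^1 {t^1, t^8}]    =", sample)
```

The anomaly factors of the $\bar 3$ are the negatives of those of the $3$ component by component, and unlike in $SU(2)$ they are not all zero: the sample component equals $\frac12 d^{118}$ with $d^{118} = 1/\sqrt3$.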
For higher representations, the anomaly coefficients can be worked out using
methods similar to those we used in Section 15.4 to compute $C_2(r)$. For example,
we show in Problem 19.3 that, if $a$ and $s$ are the $SU(n)$ representations
corresponding to antisymmetric and symmetric two-index tensors, then

$$ A(a) = n - 4, \qquad A(s) = n + 4. \tag{19.141} $$

An $SU(n)$ gauge theory is anomaly free if the anomaly coefficients of the
various irreducible components of the fermion multiplet $R$ sum to zero. For
example, the $SU(n)$ gauge theory of left-handed fermions with representation
content

$$ R = a + (n-4)\,\bar n \tag{19.142} $$

is anomaly free. 
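
The bookkeeping behind this statement is a sum of anomaly coefficients weighted by multiplicities. The Python sketch below encodes the coefficients of (19.139) to (19.141) in a small lookup; the string labels and function names are conventions invented here for illustration.

```python
def anomaly_coefficient(rep, n):
    """Anomaly coefficient A(r) for a few SU(n) representations.

    rep is a label: "n" (fundamental), "a" (antisymmetric two-index tensor),
    or "s" (symmetric two-index tensor); a trailing "bar" marks the conjugate.
    """
    table = {"n": 1, "a": n - 4, "s": n + 4}
    if rep.endswith("bar"):
        return -table[rep[:-3]]     # A(rbar) = -A(r)
    return table[rep]


def total_anomaly(content, n):
    """Sum multiplicity * A(r) over a dict {label: multiplicity}."""
    return sum(mult * anomaly_coefficient(rep, n) for rep, mult in content.items())


for n in range(5, 9):
    chiral = {"a": 1, "nbar": n - 4}        # the content R = a + (n-4) nbar
    dirac_like = {"s": 1, "sbar": 1}        # a real representation r + rbar
    mismatched = {"s": 1, "nbar": n - 4}    # not anomaly free
    print(n, total_anomaly(chiral, n), total_anomaly(dirac_like, n),
          total_anomaly(mismatched, n))
```

The first two columns vanish for every $n$, while the third, which pairs the symmetric tensor with $(n-4)$ antifundamentals, always leaves a residual anomaly of 8.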

Of the various simple Lie groups listed below (15.72), only $SU(n)$,
$SO(4n+2)$, and $E_6$ have complex representations. Of these, only $SU(n)$ and
$SO(6)$, which has the same Lie algebra as $SU(4)$, have a symmetric invariant
of the type required to build the anomaly. Gauge theories based on $SO(4n+2)$,
$n \geq 2$, and on $E_6$ are automatically anomaly free. The groups $SO(10)$ and $E_6$
have been suggested as candidates for the grand unified gauge symmetry of
particle physics, which we will discuss in Section 22.2.

There is one further constraint on the representation content of a chiral 
gauge theory, which comes from considering its coupling to gravity. It is possible
to show that the diagrams of Fig. 19.9 give an anomaly contribution
when computed with a gauge current $j^{\mu a}$ and external gravitational fields.
The group-theory factor that multiplies this diagram is

$$ \mathrm{tr}[t^a]. \tag{19.143} $$

This factor automatically vanishes if the gauge group is non-Abelian. However,
if the gauge group of the theory contains $U(1)$ factors, the theory cannot be
consistently coupled to gravity unless each of the $U(1)$ generators is traceless.*

Once we have constructed a consistent chiral gauge theory, we have an 
additional problem of finding a prescription for calculating in this theory
consistently. In a vector-like gauge theory, we can define ultraviolet-divergent
diagrams with dimensional regularization. This guarantees that the divergent
diagrams will be regulated in a way that respects the Ward identities of local
gauge invariance. To generalize dimensional regularization to chiral gauge
theories, we need to introduce a dimensional continuation of $\gamma^5$. The
't Hooft-Veltman definition used to define the chiral current in Section 19.2 is not
satisfactory, because this definition does not manifestly respect the conservation
of the gauge currents. A useful alternative procedure is to define $\gamma^5$
formally as an object that anticommutes with all of the $\gamma^\mu$. This prescription
gives unambiguous, gauge-invariant results for amplitudes that are not
proportional to $\epsilon^{\mu\nu\lambda\sigma}$, at least through two-loop order. In Section 21.3, we will


*L. Alvarez-Gaumé and E. Witten, Nucl. Phys. B234, 269 (1984).
use this prescription to compute loop diagrams in weak interaction theory.
As a last resort, one can always compute with a non-gauge-invariant regulator
and add non-gauge-invariant counterterms to the theory so that the gauge
theory Ward identities remain valid. 


19.5 Anomalous Breaking of Scale Invariance 


There is one more important example of a symmetry that is an invariance at 
the classical level and is broken by quantum corrections. This is the classical
scale invariance of a massless field theory with a dimensionless coupling
constant. In Chapter 12, we saw that a quantum field theory with no classical
dimensionful parameters still depends on a mass scale through the regularization
of ultraviolet divergences, or, equivalently, through the running of
coupling constants. We have already seen how to analyze this induced dependence
on the renormalization scale using the Callan-Symanzik equation. In
this section, we will show how the violation of classical scale invariance by 
quantum corrections can be described as a current conservation anomaly. 

In this book we have avoided giving a careful treatment of the energy- 
momentum tensor of a quantum field theory. In Section 2.2, we used Noether’s 
theorem to demonstrate that the invariance of a quantum field theory under 
spacetime translations implies the presence of a conserved tensor $T^{\mu\nu}$. In
Section 9.6, we gave an alternative derivation of this result using the functional 
integral formalism. However, to discuss the theory of scale invariance, we will 
need some more detailed properties of the energy-momentum tensor. We will 
now simply state these properties and refer elsewhere for their derivations.* 

The tensor $T^{\mu\nu}$ defined by expressions (2.17) and (9.99) is called the
canonical energy-momentum tensor. The expressions that defined this tensor
do not imply that $T^{\mu\nu}$ is symmetric. In fact, this tensor need not be symmetric,
and, in a gauge theory, it need not be gauge-invariant. However, it is always
possible to convert $T^{\mu\nu}$ into a symmetric and gauge-invariant tensor $\Theta^{\mu\nu}$ by
the addition

$$ \Theta^{\mu\nu} = T^{\mu\nu} + \partial_\lambda\Sigma^{\mu\nu\lambda}, \tag{19.144} $$

where $\Sigma^{\mu\nu\lambda}$ is antisymmetric under interchange of $\mu$ and $\lambda$. The form of the
added term implies that $\Theta^{\mu\nu}$ is conserved if $T^{\mu\nu}$ is, and that the global
energy-momentum four-vector is unchanged,

$$ P^\nu = \int d^3x\,T^{0\nu} = \int d^3x\,\Theta^{0\nu}. \tag{19.145} $$


A scale transformation of a scalar field theory can be defined as a transformation
of variables

$$ \phi(x) \to e^{-D\sigma}\,\phi(xe^{-\sigma}), \tag{19.146} $$


*The conclusions presented in the next three paragraphs are derived with care in
C. G. Callan, S. Coleman, and R. Jackiw, Ann. Phys. 59, 42 (1970).
with $D = 1$, the canonical mass dimension of the field. The scale transformation
is defined similarly in theories of fermion and gauge fields. If this transformation
is an invariance of the classical Lagrangian, as it will be if there are
no dimensionful couplings, this theory will possess a conserved current $D^\mu$,
called the dilatation current. The dilatation current has a simple relation to
the symmetric energy-momentum tensor $\Theta^{\mu\nu}$: $D^\mu = \Theta^{\mu\nu}x_\nu$, so that

$$ \partial_\mu D^\mu = \Theta^\mu{}_\mu. \tag{19.147} $$

The derivation of these results from Noether's theorem is not straightforward.
There is a simpler derivation, which, however, uses formalism beyond
the scope of this book. If the quantum field theory under consideration is
coupled to gravity, then the energy-momentum tensor can be identified as
the source of the gravitational field. This energy-momentum tensor can be
found by varying the Lagrangian $\mathcal{L}_M$ of matter fields with respect to the
spacetime metric $g_{\mu\nu}(x)$. This construction gives a manifestly symmetric and
gauge-invariant tensor, which turns out to be $\Theta^{\mu\nu}$:

$$ \Theta^{\mu\nu} = 2\,\frac{\delta}{\delta g_{\mu\nu}(x)}\int d^4x\,\mathcal{L}_M. \tag{19.148} $$

A scale transformation can be represented as a change in the spacetime metric

$$ g_{\mu\nu}(x) \to e^{2\sigma}g_{\mu\nu}(x). \tag{19.149} $$

Combining (19.148) and (19.149), we see that the change in the Lagrangian
induced by this transformation is just the trace of $\Theta^{\mu\nu}$. This will be equal by
Noether's theorem to the divergence of the corresponding current, giving us
back Eq. (19.147).

In QED, it is not hard to guess the form of the symmetric energy-momentum
tensor:

$$ \Theta^{\mu\nu} = -F^{\mu\lambda}F^\nu{}_\lambda + \frac14\,g^{\mu\nu}(F_{\lambda\sigma})^2 + \frac12\,\bar\psi\,i(\gamma^\mu D^\nu + \gamma^\nu D^\mu)\psi - g^{\mu\nu}\,\bar\psi(i\gamma^\lambda D_\lambda - m)\psi. \tag{19.150} $$

This is a gauge-invariant symmetric tensor that leads to the familiar expression
for the total energy,

$$ H = \int d^3x\,\Big\{\frac12(\mathbf{E}^2 + \mathbf{B}^2) + \psi^\dagger(-i\gamma^0\boldsymbol{\gamma}\cdot\mathbf{D} + m\gamma^0)\psi\Big\}. \tag{19.151} $$

For future reference, we note that these results are true at the classical level
in any spacetime dimension $d$. In four dimensions, the trace of the gauge field
terms cancels automatically. After using the Dirac equation, which is valid as
an operator equation of motion, we find that the trace of $\Theta^{\mu\nu}$ is given by

$$ \Theta^\mu{}_\mu = m\bar\psi\psi \tag{19.152} $$

and indeed vanishes, classically, if $m = 0$.

When quantum corrections are included, we know that a scale transformation
is not a symmetry of the theory, since the same theory referred to a
larger scale contains a different value of the renormalized coupling constant.
The shift of the renormalized coupling is

$$ g \to g + \sigma\beta(g), \tag{19.153} $$

and the corresponding change in the Lagrangian is

$$ \sigma\beta(g)\,\frac{\partial}{\partial g}\mathcal{L}. \tag{19.154} $$

Thus, when quantum corrections are included, the equation for the dilatation
current in a classically scale-invariant theory should read

$$ \partial_\mu D^\mu = \Theta^\mu{}_\mu = \beta(g)\,\frac{\partial}{\partial g}\mathcal{L}. \tag{19.155} $$

In massless QED, we can write this formula most usefully by rescaling the 
gauge fields so that the coupling constant e is removed from the covariant 
derivative: $eA^\mu \to A^\mu$. Then $e$ appears only in the term

$$ \mathcal{L} = -\frac{1}{4e^2}(F_{\mu\nu})^2 + \cdots, \tag{19.156} $$

and Eq. (19.155) reads

$$ \Theta^\mu{}_\mu = \frac{\beta(e)}{2e^3}(F_{\mu\nu})^2. \tag{19.157} $$

This relation, which says that the trace of the symmetric energy-momentum 
tensor takes a nonzero value as a result of quantum corrections, is known as 
the trace anomaly. 

We should be able to check the trace anomaly equation (19.157) directly in 
perturbation theory. We now evaluate the trace of $\Theta^{\mu\nu}$ explicitly to one-loop
order. The formalism we have set up is very similar to that of the background
field calculation of the $\beta$ function done in Section 16.6. As in that section, we
will integrate over the fluctuating parts of quantum fields in the presence of
a background field with a nonzero $F_{\mu\nu}$. Equation (19.157) predicts that this
integration will lead to the expression

$$ \langle\Theta^\mu{}_\mu\rangle = C\int\frac{d^4k}{(2\pi)^4}\,A_\mu(-k)\,(k^2g^{\mu\nu} - k^\mu k^\nu)\,A_\nu(k), \tag{19.158} $$

where $A_\mu$ is the background field and the constant $C$ is equal to $\beta(e)/e^3$.

Since we will be using dimensional regularization, we should begin by
writing the trace of $\Theta^{\mu\nu}$ in $d$ dimensions:

$$ \Theta^\mu{}_\mu = -\frac{4-d}{4e^2}(F_{\mu\nu})^2 + (1-d)\,\bar\psi\,i\gamma^\lambda D_\lambda\,\psi. \tag{19.159} $$

The one-loop matrix element of this quantity proportional to two powers of
the background field arises from the three diagrams shown in Fig. 19.10. Since
the second term on the right-hand side of (19.159) vanishes by the equation
of motion of $\psi(x)$, one might expect that this term gives zero contribution
to the trace. Indeed, it is easy to check that the first two diagrams in Fig.
Figure 19.10. One-loop diagrams contributing to the anomalous trace of
$\Theta^{\mu\nu}$.


19.10 cancel: These diagrams have the same structure, since the first has an 
extra propagator and an extra factor from the operator matrix element, and 
opposite overall signs. 

The first term on the right-hand side of (19.159) is unexpected, since
it apparently vanishes in four dimensions. However, the fermion loop diagram
is divergent, and in dimensional regularization, this introduces a factor
$1/(2 - d/2)$. As a result, this diagram gives a nonzero contribution to the
operator matrix element. In massless QED, the fermion loop diagram has the
value

$$ -i(k^2g^{\mu\nu} - k^\mu k^\nu)\,\frac{4}{3(4\pi)^2}\Big(\Gamma(2-\tfrac{d}{2}) + (\text{finite})\Big). \tag{19.160} $$

Then the complete expression for the third diagram in Fig. 19.10 is

$$ \cdots\;\frac{4}{3(4\pi)^2}\,\frac{1}{2-d/2}\;\cdots\;A_\nu(k). \tag{19.161} $$

This is of the form of (19.158), with

$$ C = \frac{4}{3(4\pi)^2} = \frac{1}{12\pi^2}, \tag{19.162} $$

which is indeed the first $\beta$ function coefficient in massless QED.
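
The mechanism at work here is that a coefficient proportional to $(4-d)$ multiplies a pole $1/(2-d/2)$, leaving a finite remainder as $d \to 4$. The Python sketch below illustrates only this arithmetic and the numerical value of the QED coefficient; the normalization $(4-d)/4$ is an assumed, illustrative choice, and the one-loop QED result $\beta(e) = e^3/12\pi^2$ is taken as given.

```python
import math


def coefficient_times_pole(d):
    """(4 - d)/4 multiplied by the dimensional-regularization pole 1/(2 - d/2)."""
    return (4 - d) / 4 * (1 / (2 - d / 2))


# The product is independent of d, so the limit d -> 4 is finite.
for epsilon in (1e-1, 1e-3, 1e-6):
    d = 4 - epsilon
    print(f"d = 4 - {epsilon:g}: product = {coefficient_times_pole(d):.12f}")

# First beta function coefficient of massless QED, beta(e)/e^3 = 1/(12 pi^2),
# written in the form in which it emerges from the fermion loop.
loop_form = 4 / (3 * (4 * math.pi) ** 2)
beta_over_e_cubed = 1 / (12 * math.pi ** 2)
print("4 / (3 (4 pi)^2) =", loop_form)
print("1 / (12 pi^2)    =", beta_over_e_cubed)
```

The product stays at $1/2$ however close $d$ is taken to 4, which is why an operator that vanishes classically in four dimensions can still have a nonzero matrix element.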

This discussion generalizes to QCD and other gauge theories. In a non- 
Abelian gauge theory, $\Theta^{\mu\nu}$ is given by the obvious generalization of (19.150)
with the Abelian field-strength tensor $F_{\mu\nu}$ replaced by the non-Abelian
expression $F^a_{\mu\nu}$. The trace of $\Theta^{\mu\nu}$ is again given by

$$ \Theta^\mu{}_\mu = -\frac{4-d}{4g^2}(F^a_{\mu\nu})^2, \tag{19.163} $$

plus terms that vanish by the equations of motion. In the background field
gauge, the one-loop diagrams with the operator $\Theta^\mu{}_\mu$ inserted into the loop
cancel as above. We saw in Section 16.6 that the two-point functions in this
gauge sum to

$$ -i(k^2g^{\mu\nu} - k^\mu k^\nu)\,\delta^{ab}\,\Big[\frac{-b_0}{(4\pi)^2}\Big]\Big(\Gamma(2-\tfrac{d}{2}) + (\text{finite})\Big), \tag{19.164} $$

where $\beta(g) = -b_0g^3/(4\pi)^2$. Following through the logic of the previous
paragraph, we again find the result (19.158) with the identification of $C$ as the
first $\beta$ function coefficient.
As with the axial vector anomaly, the trace anomaly can be found in
many different ways. For each possible method of regulating a quantum field
theory, there is a derivation of the trace anomaly that exploits the possible
pathology of that particular regulator. For example, if one uses a Pauli-Villars
regulator with heavy fermions to cancel the divergence of the QED fermion
loop diagram, the heavy fermions contribute a term $M\bar\Psi\Psi$ to the trace of
$\Theta^{\mu\nu}$. The loop diagram with this term inserted turns out to have a finite limit
as $M \to \infty$, which precisely reproduces the trace anomaly. This computation
is worked out in Problem 19.4. 

As with the axial vector anomaly, each derivation of the anomaly with 
a different regulator, taken individually, seems artificial, as if there were a 
problem with the field theory that we are not quite clever enough to fix. 
Eventually, though, we are forced to conclude that the quantum field theory 
is trying to tell us something. The anomalous symmetries of the classical 
theory cannot be promoted to symmetries of the quantum theory. Instead, 
the anomalous conservation laws require profound and qualitative changes in 
the theory from the classical to the quantum level. 

Problems 

19.1 Fermion number nonconservation in parallel E and B fields. 

(a) Show that the Adler-Bell-Jackiw anomaly equation leads to the following law
for global fermion number conservation: If $N_R$ and $N_L$ are, respectively, the
numbers of right- and left-handed massless fermions, then

$$\Delta N_R - \Delta N_L = -\frac{e^2}{2\pi^2}\int d^4x\,\mathbf{E}\cdot\mathbf{B}.$$

To set up a solvable problem, take the background field to be $A^\mu = (0, 0, Bx^1, A)$,
with $B$ constant and $A$ constant in space and varying only adiabatically in time.

(b) Show that the Hamiltonian for massless fermions represented in the components
(3.36) is

$$H = \int d^3x\,\left[\psi_R^\dagger(-i\boldsymbol{\sigma}\cdot\mathbf{D})\psi_R - \psi_L^\dagger(-i\boldsymbol{\sigma}\cdot\mathbf{D})\psi_L\right],$$

with $D^i = \nabla^i - ieA^i$. Concentrate on the term in the Hamiltonian that involves
right-handed fermions. To diagonalize this term, one must solve the eigenvalue
equation $-i\boldsymbol{\sigma}\cdot\mathbf{D}\,\psi_R = E\psi_R$.

(c) The $\psi_R$ eigenvectors can be written in the form

$$\psi_R = \begin{pmatrix}\phi_1(x^1)\\ \phi_2(x^1)\end{pmatrix} e^{i(k_2x^2 + k_3x^3)}.$$

The functions $\phi_1$ and $\phi_2$, which depend only on $x^1$, obey coupled first-order
differential equations. Show that, when one of these functions is eliminated, the
other obeys the equation of a simple harmonic oscillator. Use this observation to
find the single-particle spectrum of the Hamiltonian. Notice that the eigenvalues
do not depend on $k_2$.
(d) If the system of fermions is set up in a box with sides of length $L$ and periodic
boundary conditions, the momenta $k_2$ and $k_3$ will be quantized:

$$k_i = \frac{2\pi n_i}{L}.$$

By looking back to the harmonic oscillator equation in part (c), show that the
condition that the center of the oscillation is inside the box leads to the condition

$$k_2 < eBL.$$

Combining these two conditions, we see that each level found in part (c) has a
degeneracy of

$$\frac{eL^2B}{2\pi}.$$

(e) Now consider the effect of changing the background $A$ adiabatically by an
amount (19.37). Show that the vacuum loses right-handed fermions. Repeating
this analysis for the left-handed spectrum, one sees that the vacuum gains
the same number of left-handed fermions. Show that these numbers are in accord
with the global nonconservation law given in part (a).

19.2 Weak decay of the pion. 

(a) In the effective Lagrangian for semileptonic weak interactions (18.28), the
hadronic part of the operator is a left-handed current involving the $u$ and $d$
quarks. Show that this current is related to the quark currents of Section 19.3
as follows:

$$\bar u_L\gamma^\mu d_L = \tfrac{1}{2}\left(j^{\mu 1} + i j^{\mu 2} - j^{\mu 5 1} - i j^{\mu 5 2}\right),$$

where 1, 2 are isospin indices. Using this identification and (19.88), show that
the amplitude for the decay $\pi^+ \to \ell^+\nu$ is given by

$$i\mathcal{M} = G_F f_\pi\,\bar u(q)\,\not\!p\,(1 - \gamma^5)\,v(k),$$

where $p$, $k$, $q$ are the momenta of the $\pi^+$, $\ell^+$, $\nu$.

(b) Compute the decay rate of the pion. Show that this rate vanishes in the limit of
zero lepton mass, and that the relative rate of pion decay to muons and electrons
is given by

$$\frac{\Gamma(\pi^+ \to e^+\nu)}{\Gamma(\pi^+ \to \mu^+\nu)} = \left(\frac{m_e}{m_\mu}\right)^2 \frac{(1 - m_e^2/m_\pi^2)^2}{(1 - m_\mu^2/m_\pi^2)^2} \approx 10^{-4}.$$

From the measured pion lifetime, $\tau_\pi = 2.6 \times 10^{-8}$ sec, and the pion and muon
masses, $m_\pi = 140$ MeV, $m_\mu = 106$ MeV, determine the value of $f_\pi$.

19.3 Computation of anomaly coefficients. 

(a) Consider a product $r_1 \times r_2$ of $SU(n)$ representations, which is decomposed into
irreducible representations as in (15.98). Using the explicit form of the generators
given in (15.99), show that the anomaly coefficients satisfy

$$d_1 A(r_2) + d_2 A(r_1) = \sum_i A(r_i).$$

(b) As we saw in Problem 15.5, the two-index symmetric and antisymmetric tensors
form irreducible representations of $SU(n)$, which we will call $s$ and $a$,
respectively. In $SU(3)$, the representation $a$ is three-dimensional. Show that it is
equivalent to the $\bar 3$. Compute the anomaly coefficients for $a$ and $s$, making use
of the identity in part (a).

(c) Since $SU(n)$ has a unique three-index symmetric tensor $d^{abc}$ which is already
nonvanishing in an $SU(3)$ subgroup, we can compute the anomaly coefficient
in $SU(n)$ by restricting our attention to three generators in this subgroup. By
decomposing $SU(n)$ representations into $SU(3)$ representations, compute the
anomaly coefficients for $a$ and $s$ in $SU(n)$ and derive Eq. (19.141). Find the
anomaly coefficient of the $j$-index totally antisymmetric tensor representation
of $SU(n)$. Why does the result always vanish when $2j = n$?

19.4 Large fermion mass limits. In the text, we derived the Adler-Bell-Jackiw and 
trace anomalies by the use of dimensional regularization. As an alternative, one could 
imagine deriving these results using Pauli-Villars regularization. In that technique, one 
regularizes the value of a fermion loop integral by subtracting the value of the same 
loop diagram computed with fermions of large mass M. The parameter M plays 
the role of the cutoff and should be taken to infinity at the end of the calculation. The 
anomalies arise because some pieces of the diagrams computed for very heavy fermions
do not disappear in the limit $M \to \infty$. These nontrivial $M \to \infty$ limits are interesting
in their own right and can have physical applications (for example, in part (c) of the 
Final Project for Part III). 

(a) Show that the Adler-Bell-Jackiw anomaly equation is equivalent to the following
large-mass limit of a fermion matrix element between the vacuum and a two-photon
state:

$$\lim_{M\to\infty}\Big\{ \langle p,k|\, 2iM\bar\psi\gamma^5\psi\, |0\rangle \Big\} = -\frac{e^2}{2\pi^2}\,\epsilon^{\alpha\nu\beta\lambda}\, p_\alpha\,\epsilon^*_\nu(p)\, k_\beta\,\epsilon^*_\lambda(k).$$

(b) Show that the trace anomaly, at one-loop order, is equivalent to the following
large-mass limit:

$$\lim_{M\to\infty}\Big\{ \langle p,k|\, M\bar\psi\psi\, |0\rangle \Big\} = +\frac{e^2}{6\pi^2}\left[\,p\cdot k\;\epsilon^*(p)\cdot\epsilon^*(k) - p\cdot\epsilon^*(k)\; k\cdot\epsilon^*(p)\right].$$

(c) Show that the matrix element in part (a) is ultraviolet-finite before the $M \to \infty$
limit is taken. Evaluate the matrix element explicitly at one-loop order and verify
the limit claimed in part (a).

(d) The matrix element in part (b) has a potential ultraviolet divergence. However,
show that the coefficient of $(p\cdot\epsilon^*(k)\; k\cdot\epsilon^*(p))$ is ultraviolet-finite, and that the
rest of the expression is determined by gauge invariance. Compute the full matrix
element using dimensional regularization as a gauge-invariant regulator and
verify the result claimed in part (b).



Chapter 20 


Gauge Theories with Spontaneous 
Symmetry Breaking 


In the course of this book, we have discussed three distinct fashions in which
symmetries can be realized in a quantum field theory. The simplest case is
a global symmetry that is manifest, leading to particle multiplets with
restricted interactions. A second possibility is a global symmetry that is
spontaneously broken. Then, as discussed in Chapter 11,* the symmetry currents
are still conserved and interactions are similarly restricted, but the vacuum
state does not respect the symmetry and the particles do not form obvious
symmetry multiplets. Instead, such a theory contains massless particles,
Goldstone bosons, one for each generator of the spontaneously broken symmetry.
The third case is that of a local, or gauge, symmetry. As we saw in Chapter 15,
such a symmetry requires the existence of a massless vector field for
each symmetry generator, and the interactions among these fields are highly
restricted.

It is now only natural to consider a fourth possibility: What happens if we 
include both local gauge invariance and spontaneous symmetry breaking in 
the same theory? In this chapter and the next, we will find that this combination
of ingredients leads to new possibilities for the construction of quantum
field theory models. We will see that spontaneous symmetry breaking requires
gauge vector bosons to acquire mass. However, the interactions of these massive
bosons are still constrained by the underlying gauge symmetry, and these
constraints can have observable consequences. 

In elementary particle physics, the principal application of spontaneously 
broken local symmetry is in the currently accepted model of weak interactions.
This model, due to Glashow, Weinberg, and Salam, is introduced in
Section 20.2. There we will see that it makes a number of precise and successful
predictions for weak interaction phenomena. Remarkably, this model also
unifies the weak interactions with electromagnetism in a single larger gauge 
theory. 


*Section 11.1 is necessary background for the present chapter, but the rest of 
Chapter 11 is not. 
20.1 The Higgs Mechanism

In this section we analyze some simple examples of gauge theories with
spontaneous symmetry breaking. We begin with an Abelian gauge theory, and
then study several examples of non-Abelian models.


An Abelian Example 


As our first example, consider a complex scalar field coupled both to itself 
and to an electromagnetic field: 


$$\mathcal{L} = -\tfrac{1}{4}(F_{\mu\nu})^2 + |D_\mu\phi|^2 - V(\phi), \tag{20.1}$$


with $D_\mu = \partial_\mu + ieA_\mu$. This Lagrangian is invariant under the local $U(1)$
transformation

$$\phi(x) \to e^{i\alpha(x)}\phi(x), \qquad A_\mu(x) \to A_\mu(x) - \frac{1}{e}\partial_\mu\alpha(x). \tag{20.2}$$

If we choose the potential in $\mathcal{L}$ to be of the form

$$V(\phi) = -\mu^2\phi^*\phi + \frac{\lambda}{2}(\phi^*\phi)^2, \tag{20.3}$$

with $\mu^2 > 0$, the field will acquire a vacuum expectation value and the $U(1)$
global symmetry will be spontaneously broken. The minimum of this potential
occurs at

$$\langle\phi\rangle = \phi_0 = \left(\frac{\mu^2}{\lambda}\right)^{1/2}, \tag{20.4}$$

or at any other value related by the $U(1)$ symmetry (20.2).

Let us expand the Lagrangian (20.1) about the vacuum state (20.4). Decompose
the complex field $\phi(x)$ as

$$\phi(x) = \phi_0 + \frac{1}{\sqrt{2}}\left(\phi_1(x) + i\phi_2(x)\right). \tag{20.5}$$

The potential (20.3) is rewritten

$$V(\phi) = -\frac{1}{2\lambda}\mu^4 + \frac{1}{2}\cdot 2\mu^2\phi_1^2 + \mathcal{O}(\phi_i^3), \tag{20.6}$$

so that the field $\phi_1$ acquires the mass $m = \sqrt{2}\mu$, and $\phi_2$ is the massless
Goldstone boson. So far, this whole analysis follows that in Section 11.1.
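The masses quoted after Eq. (20.6) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the potential (20.3) on the decomposition (20.5) and estimates the curvature in the $\phi_1$ and $\phi_2$ directions by finite differences. The function names and the parameter values `mu2` and `lam` are arbitrary illustrative choices.

```python
import math

def potential(phi1, phi2, mu2, lam):
    """V = -mu^2 |phi|^2 + (lam/2) |phi|^4 with phi = phi0 + (phi1 + i phi2)/sqrt(2)."""
    phi0 = math.sqrt(mu2 / lam)              # vacuum expectation value, Eq. (20.4)
    re = phi0 + phi1 / math.sqrt(2.0)
    im = phi2 / math.sqrt(2.0)
    mod2 = re * re + im * im                 # |phi|^2
    return -mu2 * mod2 + 0.5 * lam * mod2 * mod2

def curvature(f, h=1e-3):
    """Central-difference estimate of the second derivative f''(0)."""
    return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)

mu2, lam = 1.3, 0.7                          # assumed values, chosen only for illustration
m1_sq = curvature(lambda x: potential(x, 0.0, mu2, lam))   # mass^2 of phi_1
m2_sq = curvature(lambda y: potential(0.0, y, mu2, lam))   # mass^2 of phi_2
v_min = potential(0.0, 0.0, mu2, lam)                      # value of V at the minimum

print("m1^2 =", m1_sq, " expected 2 mu^2 =", 2.0 * mu2)
print("m2^2 =", m2_sq, " expected 0")
print("V_min =", v_min, " expected -mu^4/(2 lam) =", -mu2**2 / (2.0 * lam))
```

Within the finite-difference accuracy, the curvature along $\phi_1$ reproduces $m^2 = 2\mu^2$, while the curvature along $\phi_2$ vanishes, as expected for a Goldstone direction.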

But now consider how the kinetic energy term of $\phi$ is transformed. Inserting
the expansion (20.5), we rewrite

$$|D_\mu\phi|^2 = \tfrac{1}{2}(\partial_\mu\phi_1)^2 + \tfrac{1}{2}(\partial_\mu\phi_2)^2 + \sqrt{2}e\phi_0\cdot A_\mu\partial^\mu\phi_2 + e^2\phi_0^2A_\mu A^\mu + \cdots, \tag{20.7}$$

where we have omitted terms cubic and quartic in the fields $A_\mu$, $\phi_1$, and $\phi_2$.
The last term written explicitly in (20.7) is a photon mass term

$$\Delta\mathcal{L} = \tfrac{1}{2}m_A^2A_\mu A^\mu, \tag{20.8}$$
where the mass

$$m_A^2 = 2e^2\phi_0^2 \tag{20.9}$$

arises from the nonvanishing vacuum expectation value of $\phi$. Notice that the
sign of this mass term is correct; the physical spacelike components of $A^\mu$
appear in (20.8) as

$$\Delta\mathcal{L} = -\tfrac{1}{2}m_A^2(A^i)^2,$$

with the correct sign for a potential energy term.

In Chapter 7, and again in Chapter 16 for the non-Abelian case, we argued 
that a gauge boson cannot obtain a mass, unless this mass term is associated 
with a pole in the vacuum polarization amplitude. There is a counterexample 
to this result in two-dimensional spacetime; there, as we saw in Section 19.1, 
a pole of the required form can arise from the infrared singularity generated 
by a massless fermion pair. However, in four dimensions, a pole in the vacuum
polarization amplitude can be created only by a massless scalar particle.
Typically, in situations with unbroken symmetry, no such particle is available. 

However, a model with a spontaneously broken continuous symmetry
must have massless Goldstone bosons. These scalar particles have the quantum
numbers of the symmetry currents, and therefore have just the right
quantum numbers to appear as intermediate states in the vacuum polarization.
In the model we are now discussing, we can see this pole arise explicitly
in the following way: The third term in Eq. (20.7) couples the gauge boson
directly to the Goldstone boson $\phi_2$; this gives a vertex of the form

$$(\text{$A_\mu$-$\phi_2$ mixing vertex}) = i\sqrt{2}e\phi_0(-ik^\mu) = m_Ak^\mu. \tag{20.10}$$

If we also treat the mass term (20.8) as a vertex in perturbation theory, then 
the leading-order contributions to the vacuum polarization amplitude give the 
expression 


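$$(\text{mass vertex}) + (\text{Goldstone exchange}) = im_A^2g^{\mu\nu} + (m_Ak^\mu)\frac{i}{k^2}(-m_Ak^\nu) = im_A^2\left(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\right). \tag{20.11}$$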

The Goldstone boson supplies exactly the right pole to make the vacuum 
polarization amplitude properly transverse. 
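The transversality of (20.11) is easy to confirm numerically. The sketch below is illustrative only: it assumes the mostly-minus metric, an arbitrary off-shell momentum `k`, and an arbitrary value of `m_A`, and it strips the overall factor of $i$. It contracts the amplitude with $k_\mu$ with and without the Goldstone boson pole.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]   # diagonal of g^{mu nu}, signature (+,-,-,-)

def vacuum_polarization(k, m_A, include_goldstone=True):
    """Return Pi^{mu nu}/i from Eq. (20.11) for a contravariant momentum k^mu."""
    k2 = sum(METRIC[m] * k[m] * k[m] for m in range(4))
    pi = [[0.0] * 4 for _ in range(4)]
    for m in range(4):
        for n in range(4):
            if m == n:
                pi[m][n] += m_A**2 * METRIC[m]          # mass vertex: m_A^2 g^{mu nu}
            if include_goldstone:
                pi[m][n] -= m_A**2 * k[m] * k[n] / k2   # Goldstone pole: -m_A^2 k^mu k^nu / k^2
    return pi

def contract_with_k(k, pi):
    """Compute k_mu Pi^{mu nu}, lowering the index on k with the metric."""
    return [sum(METRIC[m] * k[m] * pi[m][n] for m in range(4)) for n in range(4)]

k = [1.7, 0.3, -0.5, 0.9]          # assumed off-shell momentum with k^2 != 0
m_A = 0.8                          # assumed gauge boson mass
full = contract_with_k(k, vacuum_polarization(k, m_A))
mass_only = contract_with_k(k, vacuum_polarization(k, m_A, include_goldstone=False))

print("k_mu Pi^{mu nu} with Goldstone pole:   ", full)
print("k_mu Pi^{mu nu} without Goldstone pole:", mass_only)
```

The first contraction vanishes to rounding error, while the mass vertex alone gives $m_A^2 k^\nu$, which is not transverse.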

Although the Goldstone boson plays an important formal role in this
theory, it does not appear as an independent physical particle. The easiest
way to see this is to make a particular choice of gauge, called the unitarity
gauge. Using the local $U(1)$ gauge symmetry (20.2), we can choose $\alpha(x)$ in
such a way that $\phi(x)$ becomes real-valued at every point $x$. With this choice,
the field $\phi_2$ is removed from the theory. The Lagrangian (20.1) becomes

$$\mathcal{L} = -\tfrac{1}{4}(F_{\mu\nu})^2 + (\partial_\mu\phi)^2 + e^2\phi^2A_\mu A^\mu - V(\phi). \tag{20.12}$$

If the potential $V(\phi)$ favors a nonzero vacuum expectation value of $\phi$, the
gauge field acquires a mass; it also retains a coupling to the remaining, physical
field $\phi_1$.

This mechanism, by which spontaneous symmetry breaking generates a 
mass for a gauge boson, was explored and generalized to the non-Abelian case 
by Higgs, Kibble, Guralnik, Hagen, Brout, and Englert, and is now known 
as the Higgs mechanism. However, this mechanism had an earlier application 
to the theory of superconductivity. In Chapter 8, we constructed the Landau 
description of a second-order phase transition. To describe a superconductor, 
Landau and Ginzburg coupled this theory to an external electromagnetic field; 
they obtained precisely the Lagrangian (20.1). Since the gauge field acquires a 
nonzero mass, external electromagnetic fields penetrate a superconductor only
to the depth $m_A^{-1}$. This explains the Meissner effect, the observed exclusion
of macroscopic magnetic fields from a superconductor.

The role of the Goldstone boson in the Higgs mechanism is intricate, 
and seems mysterious at this level of the discussion. We first saw that the 
involvement of the Goldstone boson is necessary, as a matter of principle, in 
order for the gauge boson to acquire a mass. We then saw that the Goldstone 
boson can be formally eliminated from the theory. However, we might argue 
that the Goldstone boson has not completely disappeared. A massless vector 
boson has only two physical polarization states; we saw in Chapter 16 that 
the longitudinal polarization state cannot be produced, and appears in the 
formalism only to cancel other unphysical contributions. However, a massive 
vector boson must have three physical polarization states: In its rest frame, 
it is a spin-1 object, which can make no distinction between transverse and 
longitudinal polarizations. It is tempting to say that the gauge boson acquired 
its extra degree of freedom by eating the Goldstone boson. In Sections 21.1
and 21.2 we will clarify this picture, by studying the quantization and
gauge-fixing of spontaneously broken gauge theories.

Systematics of the Higgs Mechanism 

The Higgs mechanism extends straightforwardly to systems with non-Abelian 
gauge symmetry. It is not difficult to derive the general relation by which a 
set of scalar field vacuum expectation values leads to the appearance of gauge 
boson masses. Let us work out this relation and then apply it in a number of 
examples. 

Consider a system of scalar fields $\phi_i$ that appear in a Lagrangian invariant
under a symmetry group $G$, represented by the transformation

$$\phi_i \to (1 + i\alpha^at^a)_{ij}\phi_j. \tag{20.13}$$

It is convenient to write the $\phi_i$ as real-valued fields, for example, writing $n$
complex fields as $2n$ real fields. Then the group representation matrices $t^a$
must be pure imaginary and, since they are Hermitian, antisymmetric. Let us
write

$$t^a_{ij} = iT^a_{ij}, \tag{20.14}$$

so that the $T^a$ are real and antisymmetric.

If we promote the symmetry group $G$ to a local gauge symmetry, the
covariant derivative on the $\phi_i$ is

$$D_\mu\phi = (\partial_\mu - igA^a_\mu t^a)\phi = (\partial_\mu + gA^a_\mu T^a)\phi.$$

Then the kinetic energy term for the $\phi_i$ is

$$\tfrac{1}{2}(D_\mu\phi_i)^2 = \tfrac{1}{2}(\partial_\mu\phi_i)^2 + gA^a_\mu(\partial^\mu\phi_iT^a_{ij}\phi_j) + \tfrac{1}{2}g^2A^a_\mu A^{b\mu}(T^a\phi)_i(T^b\phi)_i. \tag{20.15}$$

Now let the $\phi_i$ acquire vacuum expectation values

$$\langle\phi_i\rangle = (\phi_0)_i, \tag{20.16}$$

and expand the $\phi_i$ about these values. The last term in Eq. (20.15) contains
a term with the structure of a gauge boson mass,

$$\Delta\mathcal{L} = \tfrac{1}{2}m^2_{ab}A^a_\mu A^{b\mu}, \tag{20.17}$$

with the mass matrix

$$m^2_{ab} = g^2(T^a\phi_0)_i(T^b\phi_0)_i. \tag{20.18}$$

This matrix is positive semidefinite, since any diagonal element, in any basis,
has the form

$$m^2_{aa} = g^2(T^a\phi_0)^2 \geq 0 \quad \text{(no sum)}.$$

Thus, generically, all of the gauge bosons will receive positive masses. However,
it may be that some particular generator $T^a$ of $G$ leaves the vacuum invariant:

$$T^a\phi_0 = 0.$$

In that case, the generator $T^a$ will give no contribution to (20.18), and the
corresponding gauge boson will remain massless.
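As a consistency check, the general formula (20.18) should reproduce the Abelian result (20.9). The sketch below assumes a hypothetical helper `gauge_boson_mass_matrix` and arbitrary values for `e` and `phi0_abelian`. It writes the complex field as two real components, $\phi = (\chi_1 + i\chi_2)/\sqrt{2}$, so that $D_\mu = \partial_\mu + ieA_\mu$ acts on $(\chi_1, \chi_2)$ through a single real antisymmetric generator with $g = e$.

```python
import math

def gauge_boson_mass_matrix(generators, phi0, g):
    """m^2_ab = g^2 (T^a phi0)_i (T^b phi0)_i for real antisymmetric T^a, as in Eq. (20.18)."""
    n = len(phi0)
    # (T^a phi0)_i for each generator a
    rotated = [[sum(T[i][j] * phi0[j] for j in range(n)) for i in range(n)]
               for T in generators]
    return [[g * g * sum(x * y for x, y in zip(ra, rb)) for rb in rotated]
            for ra in rotated]

# Abelian example: i e A (chi1 + i chi2) = e A (-chi2 + i chi1),
# so the real generator acting on (chi1, chi2) is T = [[0, -1], [1, 0]].
e, phi0_abelian = 0.3, 1.9                    # assumed illustrative values
T_u1 = [[0.0, -1.0], [1.0, 0.0]]
chi0 = [math.sqrt(2.0) * phi0_abelian, 0.0]   # <phi> = phi0 means <chi1> = sqrt(2) phi0
m2_abelian = gauge_boson_mass_matrix([T_u1], chi0, e)

print("m_A^2 from (20.18):", m2_abelian[0][0])
print("m_A^2 from (20.9): ", 2.0 * e**2 * phi0_abelian**2)
```

The same function applies unchanged to any list of real antisymmetric generators, which is how it would be used for the non-Abelian cases.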

As in the Abelian case, the gauge boson propagator receives a contribution
from the Goldstone bosons, which is necessary to make the vacuum polarization
amplitude transverse. To compute this contribution, we need the vertex
that mixes gauge bosons and Goldstone bosons. This comes from the second
term of the Lagrangian (20.15). When we insert the vacuum expectation value
of the scalar field (20.16), this term becomes

$$\Delta\mathcal{L} = gA^a_\mu\partial^\mu\phi_i(T^a\phi_0)_i. \tag{20.19}$$

This interaction term does not involve all of the components of $\phi$, only those
that are parallel to a vector $T^a\phi_0$ for some choice of $T^a$. These vectors
represent the infinitesimal rotations of the vacuum; thus the components $\phi_i$ that
appear in (20.19) are precisely the Goldstone bosons. Using the fact that these
bosons are massless, we can compute the counterpart, for the non-Abelian
case, of the Goldstone boson diagram in Eq. (20.11):

$$(\text{Goldstone exchange}) = \sum_j\left(gk^\mu(T^a\phi_0)_j\right)\frac{i}{k^2}\left(-gk^\nu(T^b\phi_0)_j\right). \tag{20.20}$$

The sum runs over those components $j$ with a nonzero projection onto the
space spanned by the $T^a\phi_0$, or equally well, over all $j$. This diagram is
therefore proportional to the mass matrix (20.18). Combining this expression with
the contribution to the vacuum polarization from (20.17), we find a properly
transverse result,

$$(\text{vacuum polarization}) = im^2_{ab}\left(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\right), \tag{20.21}$$

where $m^2_{ab}$ is given by Eq. (20.18).


Non-Abelian Examples 

Let us now apply this general formalism to some specific examples of
non-Abelian gauge theories. Consider first a model with an $SU(2)$ gauge field
coupled to a scalar field that transforms as a spinor of $SU(2)$. The covariant
derivative acting on $\phi$ is

$$D_\mu\phi = (\partial_\mu - igA^a_\mu\tau^a)\phi, \tag{20.22}$$

where $\tau^a = \sigma^a/2$. The square of this expression is the scalar field kinetic
energy term.

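If $\phi$ acquires a vacuum expectation value, we can use the freedom of $SU(2)$
rotations to write this expectation value in the form

$$\langle\phi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v\end{pmatrix}. \tag{20.23}$$

Then the gauge boson masses arise from

$$|D_\mu\phi|^2 = \tfrac{1}{2}g^2\begin{pmatrix}0 & v\end{pmatrix}\tau^a\tau^b\begin{pmatrix}0\\ v\end{pmatrix}A^a_\mu A^{b\mu} + \cdots. \tag{20.24}$$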

We can symmetrize the matrix product under the interchange of $a$ and $b$;
using $\{\tau^a, \tau^b\} = \tfrac{1}{2}\delta^{ab}$, we find the mass term

$$\Delta\mathcal{L} = \frac{g^2v^2}{8}A^a_\mu A^{a\mu}. \tag{20.25}$$

All three gauge bosons receive the mass

$$m_A = \frac{gv}{2}, \tag{20.26}$$

signaling that all three generators of $SU(2)$ are broken equally well by (20.23).
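The step from (20.24) to (20.26) can be verified with explicit Pauli matrices. In the sketch below, which is an illustration under stated assumptions rather than part of the text, the mass matrix is read off as $m^2_{ab} = g^2\langle\phi\rangle^\dagger\{\tau^a, \tau^b\}\langle\phi\rangle$, which follows from symmetrizing (20.24) and comparing with $\tfrac{1}{2}m^2_{ab}A^a_\mu A^{b\mu}$. The helper names and the values of `g` and `v` are arbitrary.

```python
import math

# Pauli matrices; tau^a = sigma^a / 2 as in Eq. (20.22)
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
TAU = [[[0.5 * x for x in row] for row in s] for s in SIGMA]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expectation(vec, mat):
    """Compute vec^dagger . mat . vec."""
    return sum(complex(vec[i]).conjugate() * mat[i][j] * vec[j]
               for i in range(2) for j in range(2))

def spinor_mass_matrix(g, v):
    """m^2_ab = g^2 <phi>^dagger {tau^a, tau^b} <phi> with <phi> = (0, v)/sqrt(2)."""
    vev = [0.0, v / math.sqrt(2.0)]
    m2 = []
    for a in range(3):
        row = []
        for b in range(3):
            ab, ba = matmul(TAU[a], TAU[b]), matmul(TAU[b], TAU[a])
            anticomm = [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]
            row.append((g * g * expectation(vev, anticomm)).real)
        m2.append(row)
    return m2

g, v = 0.65, 2.0                    # assumed illustrative values
m2_spinor = spinor_mass_matrix(g, v)
for row in m2_spinor:
    print(row)
print("expected diagonal entry (g v / 2)^2 =", (g * v / 2.0) ** 2)
```

The matrix comes out proportional to the identity, so all three gauge bosons are degenerate with mass $gv/2$.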
Figure 20.1. The space of configurations for a scalar field in the vector
representation of $SU(2)$. When the $SU(2)$ symmetry is spontaneously broken,
the allowed vacuum states lie on a spherical surface. If the vacuum
expectation value $\phi_0$ lies in the 3 direction, then the generator $T^3$ leaves $\phi_0$
unchanged, while $T^1$ and $T^2$ rotate $\phi_0$ in the directions shown.

What if we had taken $\phi$ to transform according to the vector representation
of $SU(2)$? If we take $\phi$ to be a real-valued vector under $SU(2)$, we must
assign it the covariant derivative

$$(D_\mu\phi)_a = \partial_\mu\phi_a + g\epsilon_{abc}A^b_\mu\phi_c. \tag{20.27}$$

Again, the $\phi$ kinetic energy term is the square of this object, and so, if $\phi$
acquires a vacuum expectation value, we find the gauge boson mass term

$$\Delta\mathcal{L} = \frac{g^2}{2}\left(\epsilon_{abc}A^b_\mu(\phi_0)_c\right)^2 + \cdots. \tag{20.28}$$

If a vector of $SU(2)$ acquires an expectation value $\phi_0$, we can choose
our coordinates so that this vector points in any particular direction in the
internal space. We will choose it to point in the 3 direction, as indicated in
Fig. 20.1:

$$\langle\phi_c\rangle = (\phi_0)_c = V\delta_{c3}. \tag{20.29}$$

Inserting (20.29) into (20.28), we find

$$\Delta\mathcal{L} = \frac{g^2}{2}V^2\left(\epsilon_{ab3}A^b_\mu\right)^2 = \frac{g^2}{2}V^2\left((A^1_\mu)^2 + (A^2_\mu)^2\right). \tag{20.30}$$

The gauge bosons corresponding to the generators 1 and 2 acquire masses

$$m_1 = m_2 = gV, \tag{20.31}$$

while the boson corresponding to the generator 3 remains massless. It is easy
to see the reason for this distinction by glancing at Fig. 20.1. The vacuum
expectation value of $\phi_c$ destroys the symmetry of rotation about the axes 1
and 2, but it preserves the symmetry of rotation about the 3 axis. As we saw
in our general analysis, gauge bosons corresponding to unbroken symmetry
generators remain massless.

It is interesting that this model contains both massive and massless gauge 
bosons, with the distinction between these bosons created by spontaneous 
symmetry breaking. If we interpret the massive bosons as W bosons and the 
massless gauge boson as the photon, it is tempting to interpret this theory as 
a unified model of weak and electromagnetic interactions. Georgi and Glashow 
proposed this model as a serious candidate for the theory of weak interactions.†
However, Nature chooses a different model, which we will discuss in the next 
section. 

We turn next to a more complicated example. Consider an $SU(3)$ gauge
theory with a scalar field in the adjoint representation. The covariant
derivative of $\phi$ takes the form

$$D_\mu\phi_a = \partial_\mu\phi_a + gf_{abc}A^b_\mu\phi_c, \tag{20.32}$$

and so the gauge field masses arise from the term

$$\Delta\mathcal{L} = \frac{g^2}{2}\left(f_{abc}A^b_\mu\phi_c\right)^2. \tag{20.33}$$

We can write this more clearly by defining the quantity

$$\Phi = \phi_ct^c, \tag{20.34}$$

where $t^c$ are the $3\times3$ traceless Hermitian matrices that represent the
generators of $SU(3)$. Using this notation and the definition (15.68) of the structure
constants, we can rewrite the mass term (20.33) as

$$\Delta\mathcal{L} = -g^2\,\mathrm{tr}\left[[t^a, \Phi][t^b, \Phi]\right]A^a_\mu A^{b\mu}. \tag{20.35}$$

Now let $\Phi$ acquire a vacuum expectation value

$$\langle\Phi\rangle = \Phi_0. \tag{20.36}$$


Since $\Phi_0$ is a traceless Hermitian matrix, we should analyze its effects by
diagonalizing it. In principle, $\Phi_0$ could have three arbitrary eigenvalues that
sum to zero. However, when one minimizes explicit potential energy functions,
one often finds expectation values that preserve some of the original symmetry.
We will consider two examples.

First, $\Phi_0$ might have the orientation

$$\Phi_0 = |\phi|\cdot\begin{pmatrix}1 & & \\ & 1 & \\ & & -2\end{pmatrix}. \tag{20.37}$$


†H. Georgi and S. L. Glashow, Phys. Rev. Lett. 28, 1494 (1972).
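This matrix commutes with the four $SU(3)$ generators

$$t^a = \begin{pmatrix}\tau^a & 0\\ 0 & 0\end{pmatrix}, \qquad t^8 = \frac{1}{2\sqrt{3}}\begin{pmatrix}1 & & \\ & 1 & \\ & & -2\end{pmatrix}. \tag{20.38}$$

Thus, the expectation value (20.37) breaks $SU(3)$ spontaneously to $SU(2)\times U(1)$
and leaves the gauge bosons corresponding to these four generators massless.
The remaining four generators of $SU(3)$,

$$t^4 = \frac{1}{2}\begin{pmatrix}0&0&1\\0&0&0\\1&0&0\end{pmatrix},\quad
t^5 = \frac{1}{2}\begin{pmatrix}0&0&-i\\0&0&0\\i&0&0\end{pmatrix},\quad
t^6 = \frac{1}{2}\begin{pmatrix}0&0&0\\0&0&1\\0&1&0\end{pmatrix},\quad
t^7 = \frac{1}{2}\begin{pmatrix}0&0&0\\0&0&-i\\0&i&0\end{pmatrix}, \tag{20.39}$$

acquire the masses

$$m^2 = (3g|\phi|)^2, \tag{20.40}$$

as one can check by substituting these matrices into Eq. (20.35).

Another possible orientation for $\Phi_0$ is

$$\Phi_0 = |\phi|\cdot\begin{pmatrix}1 & & \\ & -1 & \\ & & 0\end{pmatrix}. \tag{20.41}$$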


In this case, only $t^3$ and $t^8$ commute with $\Phi_0$, so the original $SU(3)$ symmetry
is broken down to $U(1)\times U(1)$. By substituting into (20.35), one can determine
that the gauge bosons corresponding to the remaining generators of $SU(3)$
acquire the masses

$$t^1, t^2:\ m^2 = (2g|\phi|)^2; \qquad t^4, t^5, t^6, t^7:\ m^2 = (g|\phi|)^2. \tag{20.42}$$


Still larger symmetry groups offer a wider variety of symmetry-breaking 
patterns, and more complex mass matrices. We consider one further example 
in Problem 20.1. 
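Both $SU(3)$ breaking patterns can be checked by direct matrix multiplication. The following sketch assumes the standard Gell-Mann basis with $t^a = \lambda^a/2$ and reads the mass matrix off (20.35) as $m^2_{ab} = -2g^2\,\mathrm{tr}\left([t^a, \Phi_0][t^b, \Phi_0]\right)$, where the factor of 2 comes from comparing with $\tfrac{1}{2}m^2_{ab}A^a_\mu A^{b\mu}$. The helper names and the values of `g` and `phi` are illustrative choices.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def su3_generators():
    """t^a = lambda^a / 2 in the Gell-Mann basis (assumed convention)."""
    s3 = 1.0 / math.sqrt(3.0)
    lam = [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
    ]
    return [[[0.5 * x for x in row] for row in m] for m in lam]

def adjoint_mass_matrix(phi_matrix, g):
    """m^2_ab = -2 g^2 tr([t^a, Phi0][t^b, Phi0]), read off from Eq. (20.35)."""
    comms = [commutator(ta, phi_matrix) for ta in su3_generators()]
    return [[(-2.0 * g * g * trace(matmul(ca, cb))).real for cb in comms]
            for ca in comms]

def diag3(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

g, phi = 0.6, 1.5                                                # assumed illustrative values
m2_first = adjoint_mass_matrix(diag3(phi, phi, -2 * phi), g)     # orientation (20.37)
m2_second = adjoint_mass_matrix(diag3(phi, -phi, 0.0), g)        # orientation (20.41)

print("orientation (20.37):", [round(m2_first[a][a], 10) for a in range(8)])
print("orientation (20.41):", [round(m2_second[a][a], 10) for a in range(8)])
```

In both cases the mass matrix is diagonal in this basis, with zeros for the generators that commute with $\Phi_0$.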


Formal Description of the Higgs Mechanism 

Up to this point, our study of the Higgs mechanism has been based on the
explicit analysis of scalar field Lagrangians coupled to gauge fields. Scalar field
theories provide the simplest examples of systems with spontaneous symmetry
breaking, and the explicit calculations they allow are useful for visualization.
But symmetries can be broken in other ways. In the theory of superconductivity,
for example, the Abelian gauge invariance of electromagnetism is broken
by pairs of electrons that condense in the ground state of a metal. In
Section 19.3, we argued that, in the approximation that quark masses are very
small, QCD possesses global symmetries that are spontaneously broken by a
condensation of quark-antiquark pairs. In these examples, spontaneous symmetry
breaking is the result of strong interactions beyond perturbation theory.
We would like to understand whether these more general mechanisms of
spontaneous symmetry breaking can also give mass to vector bosons, and, if so,
how the masses can be calculated.

To carry out this analysis, we will need to abstract several ideas from 
the preceding discussion. First, we will discuss in general terms the relations 
between gauge bosons, Goldstone bosons, and global symmetry currents. Then 
we will use this information to construct the gauge boson mass matrix without 
making direct use of the Lagrangian. 

Consider, first, an arbitrary quantum field theory $\mathcal{L}_0$ with a global
symmetry $G$. In Section 9.6, we derived the Noether current corresponding to the
$G$ symmetry by varying the Lagrangian by a local gauge transformation with
infinitesimal parameter $\alpha^a(x)$. Transforming with a constant $\alpha^a$ should leave
$\mathcal{L}_0$ unchanged. Then the more general variation of $\mathcal{L}_0$ must take the form

$$\delta\mathcal{L}_0 = -(\partial_\mu\alpha^a)J^{\mu a}, \tag{20.43}$$

for some set of vector operators $J^{\mu a}$ built from the fields of $\mathcal{L}_0$. The variational
principle then tells us that

$$\partial_\mu J^{\mu a} = 0. \tag{20.44}$$

We can identify the $J^{\mu a}$ as the Noether currents of the global gauge symmetry.

We can now couple this globally symmetric theory to non-Abelian gauge
fields, promoting the global symmetry to a local symmetry. To first order in $g$,
the new Lagrangian should take the form

$$\mathcal{L} = \mathcal{L}_0 + gA^a_\mu J^{\mu a} + \mathcal{O}(A^2). \tag{20.45}$$

To check this, note that the transformation (20.43) compensates the variation
due to a gauge transformation of $A^a_\mu$, Eq. (15.46), to leading order in $g$.
The terms of order $A^2$ and higher can in general be arranged to compensate
the higher-order terms in the gauge transformation. Thus, matrix elements
involving only one insertion of the gauge field can be evaluated using properties
of the Noether currents of the original globally symmetric theory. Note in
particular that the conservation law for these currents, Eq. (20.44), guarantees
that the Ward identities for these matrix elements are satisfied.

If the global symmetry of the theory $\mathcal{L}_0$ is spontaneously broken, this
theory will contain Goldstone bosons, which will stand in a special relation
to the Noether currents. At long wavelength, the Goldstone bosons become
infinitesimal symmetry rotations of the vacuum, $Q^a|0\rangle$, where $Q^a$ is the global
charge associated with $J^{\mu a}$. Thus, the operators $J^{\mu a}$ have the correct quantum
numbers to create Goldstone bosons from the vacuum. Let $|\pi_k\rangle$ denote a
Goldstone boson state. In general, there will be a current $J^{\mu a}$ that can create
or destroy this boson; we can parametrize the corresponding matrix element
as

$$\langle 0|J^{\mu a}(x)|\pi_k(p)\rangle = -ip^\mu F^a{}_k\,e^{-ip\cdot x}, \tag{20.46}$$

where $p^\mu$ is the on-shell momentum of the boson and $F^a{}_k$ is a matrix of
constants. The elements $F^a{}_k$ vanish when $a$ denotes a generator of an unbroken
symmetry. Then the nonvanishing matrix elements of $F^a{}_k$ connect the currents
of the spontaneously broken symmetries to their corresponding Goldstone
bosons. Since the currents $J^{\mu a}$ are conserved, we find

$$0 = \partial_\mu\langle 0|J^{\mu a}(x)|\pi_k(p)\rangle = -p^2F^a{}_k\,e^{-ip\cdot x}, \tag{20.47}$$

which implies that the bosons with nonzero matrix elements (20.46) satisfy
$p^2 = 0$ on shell and so are massless. This is another proof of Goldstone's
theorem.‡

Since the scalar field theory that we examined earlier in this section should
be a special case of this analysis, we should find there an example of the
relation given in Eq. (20.46). Comparing Eqs. (20.15) and (20.45), we see
that, for the scalar field theory,

$$J^{\mu a} = \partial^\mu\phi_iT^a_{ij}\phi_j, \tag{20.48}$$

which is indeed the Noether current corresponding to the global gauge symmetry.
Inserting the vacuum expectation value (20.16), we find

$$J^{\mu a} = \partial^\mu\phi_i(T^a\phi_0)_i, \tag{20.49}$$

which leads to the set of matrix elements

$$\langle 0|J^{\mu a}(x)|\pi_i(p)\rangle = -ip^\mu(T^a\phi_0)_i\,e^{-ip\cdot x}. \tag{20.50}$$

Using this relation, we can identify

$$F^a{}_i = T^a_{ij}\phi_{0j} \tag{20.51}$$

for the Higgs mechanism in a weakly coupled scalar field theory. To be more
precise, the index $i$ runs over all components of the scalar field $\phi$. However,
we saw in the discussion below Eq. (20.19) that (20.51) is nonzero only for
components $\phi_i$ that are Goldstone bosons, and only for symmetry generators
$a$ that are spontaneously broken. Thus the nonzero components of (20.51)
form precisely the structure (20.46).

As a concrete illustration of the way that the objects $T^a\phi_0$ link
spontaneously broken generators and Goldstone bosons, consider the situation of
$SU(2)$ symmetry broken by a scalar field in the vector representation, as in
Eq. (20.29) and Fig. 20.1. According to the figure, rotations about the 1 axis
tip the vacuum expectation value of $\phi$ into the 2 direction, rotations about
the 2 axis tip this expectation value into the 1 direction, and rotations about
3 leave $\langle\phi\rangle$ invariant. Thus the gauge generators $T^1$ and $T^2$ are spontaneously
broken, and the scalar field components $\phi^2$ and $\phi^1$ are the corresponding
Goldstone bosons. This accords with the result of computing the elements of
$(T^a\phi_0)_i$ explicitly: Using $(T^a)_{bc} = \epsilon_{bac}$, we find

$$(T^a\phi_0)_b = \epsilon_{bac}\langle\phi_c\rangle = V\epsilon_{ba3}. \tag{20.52}$$


‡A special case of this argument appeared in the discussion of Eq. (19.88).
Inserting this result into formula (20.50), we see that the current of each
spontaneously broken symmetry creates and destroys its own Goldstone boson.

Now we can use this formalism to study the working of the Higgs mechanism
in this general context. Consider the original theory $\mathcal{L}_0$ coupled to gauge
bosons of $G$. To see how the Higgs mechanism operates, we must compute the
vacuum polarization amplitude. This amplitude is required by the Ward identity
to be transverse, so it is necessarily of the form

$$(\text{vacuum polarization}) = i\left(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\right)\cdot\left(m^2_{ab} + \mathcal{O}(k^2)\right). \tag{20.53}$$

It is not easy to compute the nonsingular terms in (20.53) in this general
situation, but it is straightforward to compute the singular term, which comes
from contributions with an intermediate Goldstone boson. Combining Eqs.
(20.45) and (20.46), we see that the amplitude for a gauge boson to convert
to a Goldstone boson is

$$(\text{gauge boson-Goldstone boson vertex}) = -gk^\mu F^a{}_j. \tag{20.54}$$

Then the pole contribution to the vacuum polarization is

$$(\text{Goldstone exchange}) = \left(gk^\mu F^a{}_j\right)\frac{i}{k^2}\left(-gk^\nu F^b{}_j\right). \tag{20.55}$$

Comparing (20.55) with (20.21), we identify

$$m^2_{ab} = g^2F^a{}_jF^b{}_j. \tag{20.56}$$

Notice that, in the case in which the symmetry is broken by a scalar field, 
this result reverts to (20.18). However, Eq. (20.56) applies to any theory of 
spontaneously broken symmetry, whether the symmetry breaking is apparent 
from the Lagrangian or whether it requires strong interactions or other
nonperturbative effects. It is a general result, then, that any gauge boson coupled
to the current of a spontaneously broken symmetry acquires a mass. 
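To see Eq. (20.56) at work in a case where the answer is already known, we can build the matrix $F^a{}_j$ for the $SU(2)$ vector model from (20.51) and (20.52) and compare with the masses (20.31). This is a small sketch under stated assumptions: the function names are hypothetical, indices run over 0, 1, 2 in place of 1, 2, 3, and the values of `g` and `V` are arbitrary.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def goldstone_couplings(V):
    """F[a][b] = (T^a phi_0)_b = V * epsilon_{b a 3}, Eq. (20.52); index 2 is the 3 direction."""
    return [[V * levi_civita(b, a, 2) for b in range(3)] for a in range(3)]

def mass_matrix_from_F(F, g):
    """m^2_ab = g^2 F^a_j F^b_j, Eq. (20.56)."""
    return [[g * g * sum(F[a][j] * F[b][j] for j in range(3)) for b in range(3)]
            for a in range(3)]

g, V = 0.65, 2.0                    # assumed illustrative values
F = goldstone_couplings(V)
m2_vector = mass_matrix_from_F(F, g)

print("F =", F)
print("m^2 =", m2_vector)
```

The rows of `F` show that generator 1 couples only to field component 2 and generator 2 only to component 1, while the row for generator 3 vanishes. The resulting mass matrix is diagonal with entries $g^2V^2$, $g^2V^2$, and 0, in agreement with (20.31).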

20.2 The Glashow-Weinberg-Salam Theory 
of Weak Interactions 

We are now ready to write down the spontaneously broken gauge theory 
that gives the experimentally correct description of the weak interactions, a 
model introduced by Glashow, Weinberg, and Salam (GWS). Like the second 
SU( 2) model considered in the previous section, this model gives a unified 
description of weak and electromagnetic interactions, in which the massless 
photon corresponds to a particular combination of symmetry generators that 
remains unbroken. 

Again we begin with a theory with SU(2) gauge symmetry. To break the
symmetry spontaneously, we introduce a scalar field in the spinor
representation of SU(2), as in Eq. (20.22). However, we know that this theory
leads to a system with no massless gauge bosons. We therefore introduce an
additional U(1) gauge symmetry. We assign the scalar field a charge +1/2 under
this U(1) symmetry, so that its complete gauge transformation is

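$$ \phi \to e^{i\alpha^a\tau^a}\, e^{i\beta/2}\, \phi. \tag{20.57} $$

(Here $\tau^a = \sigma^a/2$.) If the field $\phi$ acquires a vacuum expectation
value of the form

$$ \langle\phi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix}, \tag{20.58} $$

then a gauge transformation with

$$ \alpha^1 = \alpha^2 = 0, \qquad \alpha^3 = \beta \tag{20.59} $$

leaves $\langle\phi\rangle$ invariant. Thus, the theory will contain one massless
gauge boson, corresponding to this particular combination of generators. The
remaining three gauge bosons will acquire masses from the Higgs mechanism.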


Gauge Boson Masses 


It is straightforward to work out the details of the mass spectrum by using
the methods of the previous section. The covariant derivative of $\phi$ is

$$ D_\mu\phi = \left(\partial_\mu - i g A^a_\mu \tau^a - i\tfrac{1}{2} g' B_\mu\right)\phi, \tag{20.60} $$

where $A^a_\mu$ and $B_\mu$ are, respectively, the SU(2) and U(1) gauge bosons.
Since the SU(2) and U(1) factors of the gauge group commute with one another,
they can have different coupling constants, which we have called $g$ and $g'$.

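The gauge boson mass terms come from the square of Eq. (20.60), evaluated
at the scalar field vacuum expectation value (20.58). The relevant terms
are

$$ \Delta\mathcal{L} = \frac{1}{2}\begin{pmatrix} 0 & v \end{pmatrix}\left(g A^a_\mu \tau^a + \tfrac{1}{2} g' B_\mu\right)\left(g A^{b\mu} \tau^b + \tfrac{1}{2} g' B^\mu\right)\begin{pmatrix} 0 \\ v \end{pmatrix}. \tag{20.61} $$

If we evaluate the matrix product explicitly, using $\tau^a = \sigma^a/2$, we find

$$ \Delta\mathcal{L} = \frac{1}{2}\,\frac{v^2}{4}\left[g^2 (A^1_\mu)^2 + g^2 (A^2_\mu)^2 + (-g A^3_\mu + g' B_\mu)^2\right]. \tag{20.62} $$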

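There are three massive vector bosons, which we will denote as follows:

$$ \begin{aligned}
W^\pm_\mu &= \frac{1}{\sqrt{2}}\left(A^1_\mu \mp i A^2_\mu\right) &&\text{with mass}\quad m_W = g\,\frac{v}{2}; \\
Z^0_\mu &= \frac{1}{\sqrt{g^2+g'^2}}\left(g A^3_\mu - g' B_\mu\right) &&\text{with mass}\quad m_Z = \sqrt{g^2+g'^2}\,\frac{v}{2}.
\end{aligned} \tag{20.63} $$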




The fourth vector field, orthogonal to $Z^0_\mu$, remains massless:

$$ A_\mu = \frac{1}{\sqrt{g^2+g'^2}}\left(g' A^3_\mu + g B_\mu\right) \quad\text{with mass}\quad m_A = 0. \tag{20.64} $$

We will identify this field with the electromagnetic vector potential.

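As a quick numerical cross-check of Eqs. (20.62)-(20.64), the following Python sketch builds the neutral-sector mass-squared matrix in the $(A^3, B)$ basis and confirms that the $Z^0$ and photon combinations are eigenvectors, with eigenvalues $(g^2+g'^2)v^2/4$ and 0 respectively. The function names and the sample values of $g$, $g'$, and $v$ are illustrative assumptions, not values taken from the text.

```python
import math


def neutral_mass_matrix(g, gp, v):
    """Mass-squared matrix in the (A^3, B) basis, read off from Eq. (20.62)."""
    scale = v * v / 4.0
    return [[scale * g * g, -scale * g * gp],
            [-scale * g * gp, scale * gp * gp]]


def apply_matrix(m, vec):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * vec[0] + m[0][1] * vec[1],
            m[1][0] * vec[0] + m[1][1] * vec[1]]


def mass_eigenstates(g, gp, v):
    """Return the Z and photon directions and the tree-level W and Z masses."""
    norm = math.hypot(g, gp)
    z_vec = [g / norm, -gp / norm]   # Z^0 combination, Eq. (20.63)
    a_vec = [gp / norm, g / norm]    # photon combination, Eq. (20.64)
    m_w = g * v / 2.0
    m_z = norm * v / 2.0
    return z_vec, a_vec, m_w, m_z


if __name__ == "__main__":
    g, gp, v = 0.65, 0.35, 246.0     # illustrative numbers only
    m2 = neutral_mass_matrix(g, gp, v)
    z_vec, a_vec, m_w, m_z = mass_eigenstates(g, gp, v)
    print("M^2 acting on Z direction:", apply_matrix(m2, z_vec))
    print("m_Z^2 times Z direction:  ", [m_z ** 2 * x for x in z_vec])
    print("M^2 acting on photon:     ", apply_matrix(m2, a_vec))
```

The photon direction is annihilated by the matrix for any choice of $g$ and $g'$, which is the numerical counterpart of the statement that one combination of generators remains unbroken.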
From now on it will be more convenient to write all expressions in terms
of these mass eigenstate fields. Consider, for instance, the coupling of the
vector fields to fermions. For a fermion field belonging to a general SU(2)
representation, with U(1) charge $Y$, the covariant derivative takes the form

$$ D_\mu = \partial_\mu - i g A^a_\mu T^a - i g' Y B_\mu. \tag{20.65} $$

In terms of the mass eigenstate fields, this becomes


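$$ D_\mu = \partial_\mu - i\frac{g}{\sqrt{2}}\left(W^+_\mu T^+ + W^-_\mu T^-\right) - i\frac{1}{\sqrt{g^2+g'^2}}\, Z_\mu \left(g^2 T^3 - g'^2 Y\right) - i\frac{g g'}{\sqrt{g^2+g'^2}}\, A_\mu \left(T^3 + Y\right), \tag{20.66} $$

where $T^\pm = (T^1 \pm i T^2)$. The normalization is chosen so that, in the
spinor representation of SU(2),

$$ T^\pm = \tfrac{1}{2}\left(\sigma^1 \pm i\sigma^2\right) = \sigma^\pm. \tag{20.67} $$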


The last term of Eq. (20.66) makes explicit the fact that the massless gauge
boson $A_\mu$ couples to the gauge generator $(T^3 + Y)$, which generates
precisely the symmetry operation (20.59).

To put expression (20.66) into a more useful form, we should identify the
coefficient of the electromagnetic interaction as the electron charge $e$,

$$ e = \frac{g g'}{\sqrt{g^2+g'^2}}, \tag{20.68} $$

and identify the electric charge quantum number as

$$ Q = T^3 + Y. \tag{20.69} $$

These substitutions, with $Q = -1$ for the electron, give the conventional form
of the coupling of the electromagnetic field.

To simplify expression (20.66) further, we define the weak mixing angle,
$\theta_w$, to be the angle that appears in the change of basis from $(A^3, B)$
to $(Z^0, A)$:

$$ \begin{pmatrix} Z^0 \\ A \end{pmatrix} = \begin{pmatrix} \cos\theta_w & -\sin\theta_w \\ \sin\theta_w & \cos\theta_w \end{pmatrix} \begin{pmatrix} A^3 \\ B \end{pmatrix}; \tag{20.70} $$

that is,

$$ \cos\theta_w = \frac{g}{\sqrt{g^2+g'^2}}, \qquad \sin\theta_w = \frac{g'}{\sqrt{g^2+g'^2}}. $$

Then, with the manipulation in the $Z^0$ coupling

$$ g^2 T^3 - g'^2 Y = (g^2 + g'^2)\, T^3 - g'^2 Q, $$

we can rewrite the covariant derivative (20.66) in the form

$$ D_\mu = \partial_\mu - i\frac{g}{\sqrt{2}}\left(W^+_\mu T^+ + W^-_\mu T^-\right) - i\frac{g}{\cos\theta_w}\, Z_\mu \left(T^3 - \sin^2\theta_w\, Q\right) - i e A_\mu Q, \tag{20.71} $$

where

$$ g = \frac{e}{\sin\theta_w}. \tag{20.72} $$
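The step from Eq. (20.66) to Eq. (20.71) is pure algebra, so it can be checked numerically. The sketch below evaluates the $Z$ and photon coefficients in both forms for arbitrary inputs and compares them. The function names and test values are illustrative assumptions; only Eqs. (20.66)-(20.72) are taken from the text.

```python
import math


def couplings_20_66(g, gp, t3, y):
    """Z and photon coefficients as they appear in Eq. (20.66)."""
    norm = math.hypot(g, gp)
    z_coeff = (g * g * t3 - gp * gp * y) / norm
    a_coeff = g * gp / norm * (t3 + y)
    return z_coeff, a_coeff


def couplings_20_71(g, gp, t3, y):
    """The same coefficients written with e, theta_w, and Q as in Eq. (20.71)."""
    norm = math.hypot(g, gp)
    cos_w, sin_w = g / norm, gp / norm   # mixing angle defined below Eq. (20.70)
    e = g * gp / norm                    # Eq. (20.68)
    q = t3 + y                           # Eq. (20.69)
    z_coeff = g / cos_w * (t3 - sin_w ** 2 * q)
    a_coeff = e * q
    return z_coeff, a_coeff


if __name__ == "__main__":
    # Arbitrary couplings and a few sample (T^3, Y) assignments.
    for t3, y in [(0.5, -0.5), (-0.5, -0.5), (0.0, -1.0), (0.5, 1.0 / 6.0)]:
        print(couplings_20_66(0.65, 0.35, t3, y), couplings_20_71(0.65, 0.35, t3, y))
```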


We see here that the couplings of all of the weak bosons are described by
two parameters: the well-measured electron charge $e$, and a new parameter
$\theta_w$. The couplings induced by $W$ and $Z$ exchange will also involve the
masses of these particles. However, these masses are not independent, since it
follows from Eqs. (20.63) that

$$ m_W = m_Z \cos\theta_w. \tag{20.73} $$

All effects of $W$ and $Z$ exchange processes, at least at tree level, can be
written in terms of the three basic parameters $e$, $\theta_w$, and $m_W$.


Coupling to Fermions 

The covariant derivative (20.71) uniquely determines the coupling of the W 
and $Z^0$ fields to fermions, once the quantum numbers of the fermion fields are
specified. To determine these quantum numbers, we must take account of the 
fact, mentioned in Section 17.3, that the W boson couples only to left-handed 
helicity states of quarks and leptons. 

At the level of the classical Lagrangian, there is no difficulty in
constructing theories in which the left- and right-handed components of a
fermion field couple differently to gauge bosons.* Already in Section 3.2 we
saw that the kinetic energy term for Dirac fermions splits into separate pieces
for the left- and right-handed fields:

$$ \bar\psi\, i\slashed{\partial}\, \psi = \bar\psi_L\, i\slashed{\partial}\, \psi_L + \bar\psi_R\, i\slashed{\partial}\, \psi_R. \tag{20.74} $$

When we couple $\psi$ to a gauge field, we can assign $\psi_L$ and $\psi_R$ to
different representations of the gauge group. Then the two terms on the
right-hand side of (20.74) will contain two different covariant derivatives,
and these will imply two different sets of couplings.

In the GWS model, we can use this technique to ensure that only the
left-handed components of the quark and lepton fields couple to the $W$ bosons.
We assign the left-handed fermion fields to doublets of SU(2), while making the
right-handed fermion fields singlets under this group. Once we have specified
the $T^3$ value for each fermion field, the value of $Y$ that we must assign
follows from Eq. (20.69). This means that the $Y$ assignments will also be
different for the left- and right-handed components of quarks and leptons. For
the right-handed fields, $T^3 = 0$, and so we reproduce the standard electric
charges by assigning $Y$ to equal the electric charge. For example, for the
right-handed $u$ quark field, $Y = +2/3$; for $e^-_R$, $Y = -1$. For the
left-handed fields,

$$ E_L = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}_L, \qquad Q_L = \begin{pmatrix} u \\ d \end{pmatrix}_L, \tag{20.75} $$

the assignments $Y = -1/2$ and $Y = +1/6$, respectively, combine with
$T^3 = \pm 1/2$ to give the correct electric charge assignments. Since the left-
and right-handed fermions live in different representations of the fundamental
gauge group, it is often useful to think of these components as distinct
particles, which are mixed by the fermion mass terms.

*In Section 19.4, we argued that there is a possible problem with this strategy at
the level of quantum corrections. We will check below whether the specific model we
construct avoids this problem.
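The hypercharge assignments quoted above follow mechanically from Eq. (20.69), $Y = Q - T^3$. A minimal Python sketch of this bookkeeping is given below; the field labels and the dictionary layout are illustrative choices, and exact fractions are used so that the two members of each doublet can be compared without rounding.

```python
from fractions import Fraction as F

# (T^3, Q) for the chiral fields of one generation; labels are illustrative.
FIELDS = {
    "nu_L": (F(1, 2), F(0)),
    "e_L": (F(-1, 2), F(-1)),
    "u_L": (F(1, 2), F(2, 3)),
    "d_L": (F(-1, 2), F(-1, 3)),
    "e_R": (F(0), F(-1)),
    "u_R": (F(0), F(2, 3)),
    "d_R": (F(0), F(-1, 3)),
}


def hypercharge(t3, q):
    """Solve Eq. (20.69), Q = T^3 + Y, for Y."""
    return q - t3


if __name__ == "__main__":
    for name, (t3, q) in FIELDS.items():
        print(f"{name}: T3 = {t3}, Q = {q}, Y = {hypercharge(t3, q)}")
```

Both members of a doublet must come out with the same $Y$, since the U(1) factor commutes with SU(2); the sketch makes that consistency condition easy to verify.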

In fact, the construction of fermion mass terms is a serious problem,
because all possible such terms are forbidden by global gauge invariance. For
example, we cannot write an electron mass term

$$ \Delta\mathcal{L} = -m_e\left(\bar e_L e_R + \bar e_R e_L\right), \tag{20.76} $$

because the fields $e_L$ and $e_R$ belong to different SU(2) representations
and have different U(1) charges. For the next few pages, we will ignore this
problem by treating all fermion fields as massless. This description will
suffice to analyze the structure of the weak interactions at high energies,
where the quark and lepton masses can be ignored. At the end of this section we
will return to the problem of writing quark and lepton mass terms in the GWS
theory. The solution to this problem will reinforce the idea that the left- and
right-handed fermion fields are fundamentally independent entities, mixed to
form massive fermions by some subsidiary process.

If we ignore fermion masses, the Lagrangian for the weak interactions of
quarks and leptons follows directly from the charge assignments given above.
The fermion kinetic energy terms for $e$, $\nu$, $u$, and $d$ are

$$ \mathcal{L} = \bar E_L (i\slashed{D}) E_L + \bar e_R (i\slashed{D}) e_R + \bar Q_L (i\slashed{D}) Q_L + \bar u_R (i\slashed{D}) u_R + \bar d_R (i\slashed{D}) d_R. \tag{20.77} $$

In each term, the covariant derivative is given by Eq. (20.65), with $T^a$ and
$Y$ evaluated in the particular representation to which that fermion field
belongs. For example,

$$ \bar Q_L (i\slashed{D}) Q_L = \bar Q_L\, i\gamma^\mu \left(\partial_\mu - i g A^a_\mu \tau^a - i\tfrac{1}{6} g' B_\mu\right) Q_L. \tag{20.78} $$

A right-handed neutrino would have zero coupling both to SU(2) and to U(1),
so we have simply omitted this field from Eq. (20.77).

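To work out the physical consequences of the fermion-vector boson
couplings, we should write Eq. (20.77) in terms of the vector boson mass
eigenstates, using the form of the covariant derivative given in Eq. (20.71).
Equation (20.77) then takes the form

$$ \begin{aligned}
\mathcal{L} ={}& \bar E_L (i\slashed{\partial}) E_L + \bar e_R (i\slashed{\partial}) e_R + \bar Q_L (i\slashed{\partial}) Q_L + \bar u_R (i\slashed{\partial}) u_R + \bar d_R (i\slashed{\partial}) d_R \\
&+ g\left(W^+_\mu J^{\mu+}_W + W^-_\mu J^{\mu-}_W + Z^0_\mu J^\mu_Z\right) + e A_\mu J^\mu_{EM},
\end{aligned} \tag{20.79} $$

where

$$ \begin{aligned}
J^{\mu+}_W ={}& \frac{1}{\sqrt{2}}\left(\bar\nu_L \gamma^\mu e_L + \bar u_L \gamma^\mu d_L\right); \\
J^{\mu-}_W ={}& \frac{1}{\sqrt{2}}\left(\bar e_L \gamma^\mu \nu_L + \bar d_L \gamma^\mu u_L\right); \\
J^\mu_Z ={}& \frac{1}{\cos\theta_w}\Bigl[\bar\nu_L \gamma^\mu \bigl(\tfrac{1}{2}\bigr) \nu_L + \bar e_L \gamma^\mu \bigl(-\tfrac{1}{2} + \sin^2\theta_w\bigr) e_L + \bar e_R \gamma^\mu \bigl(\sin^2\theta_w\bigr) e_R \\
&\qquad + \bar u_L \gamma^\mu \bigl(\tfrac{1}{2} - \tfrac{2}{3}\sin^2\theta_w\bigr) u_L + \bar u_R \gamma^\mu \bigl(-\tfrac{2}{3}\sin^2\theta_w\bigr) u_R \\
&\qquad + \bar d_L \gamma^\mu \bigl(-\tfrac{1}{2} + \tfrac{1}{3}\sin^2\theta_w\bigr) d_L + \bar d_R \gamma^\mu \bigl(\tfrac{1}{3}\sin^2\theta_w\bigr) d_R\Bigr]; \\
J^\mu_{EM} ={}& \bar e \gamma^\mu (-1) e + \bar u \gamma^\mu \bigl(+\tfrac{2}{3}\bigr) u + \bar d \gamma^\mu \bigl(-\tfrac{1}{3}\bigr) d.
\end{aligned} \tag{20.80} $$

Here we have used Eq. (20.67) to simplify the $W$ boson currents. Notice
that the current $J^\mu_{EM}$ associated with the photon field is indeed the
standard electromagnetic current.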
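Each bracketed coefficient in the $Z$ current of Eq. (20.80) is just $T^3 - Q\sin^2\theta_w$ evaluated for the field in question, as dictated by Eq. (20.71). The sketch below tabulates these coefficients with exact fractions. The field labels, the function name, and the sample value $\sin^2\theta_w = 23/100$ are assumptions made for illustration.

```python
from fractions import Fraction as F

# (T^3, Q) for each chiral field of one generation; labels are illustrative.
QUANTUM_NUMBERS = {
    "nu_L": (F(1, 2), F(0)),
    "e_L": (F(-1, 2), F(-1)),
    "e_R": (F(0), F(-1)),
    "u_L": (F(1, 2), F(2, 3)),
    "u_R": (F(0), F(2, 3)),
    "d_L": (F(-1, 2), F(-1, 3)),
    "d_R": (F(0), F(-1, 3)),
}


def z_coupling(t3, q, sin2):
    """Coefficient T^3 - Q sin^2(theta_w) multiplying g / cos(theta_w)."""
    return t3 - q * sin2


if __name__ == "__main__":
    sin2 = F(23, 100)   # sample value of sin^2(theta_w)
    for name, (t3, q) in QUANTUM_NUMBERS.items():
        print(f"{name}: {z_coupling(t3, q, sin2)}")
```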


Anomaly Cancellation 

As we have just seen, there is no difficulty in writing a Lagrangian that
couples the GWS gauge bosons to fermions in a chiral fashion. However, these
chiral couplings do present a potential problem that appears at the level of
one-loop corrections. In Section 19.2, we saw that an axial current that is
conserved at the level of the classical equations of motion can acquire a
nonzero divergence through one-loop diagrams that couple this current to a pair
of gauge bosons. The Feynman diagram that contains this anomalous contribution
is a triangle diagram with the axial current and the two gauge currents
at its vertices. In a gauge theory in which gauge bosons couple to a chiral
current, the dangerous triangle diagrams appear in the one-loop corrections
to the three-gauge-boson vertex function. The anomalous terms violate the
Ward identity for this amplitude. Thus, as we argued in Section 19.4, theories
in which gauge bosons couple to chiral currents can be gauge invariant
only if the anomalous contribution somehow disappears. Fortunately, as we
saw there, the anomalous terms can be arranged to cancel when one sums
over all possible fermion species that can circulate in these diagrams.†

Within the GWS theory, the requirement from experiment that the weak
interaction currents are left-handed forced us to choose a chiral gauge
coupling. Now we must check that the anomalous terms from the triangle
diagrams cancel as required. We will find that they do, but only through a
subtle and rather magical interplay of the quantum numbers of quarks and leptons.

†If you have not read Chapter 19, but you are willing to assume that the fermion
triangle diagram contains a contribution that violates gauge invariance, you should
still be able to follow the argument given here.

The anomalous term of a triangle diagram of three gauge bosons $A^a_\mu$,
$A^b_\nu$, and $A^c_\lambda$ is proportional to the group theoretic invariant

$$ \mathrm{tr}\left[\gamma^5\, t^a \{t^b, t^c\}\right], \tag{20.81} $$

where the trace is taken over all fermion species. The anticommutator comes
from taking the sum of two triangle diagrams in which the fermions circle in
opposite directions. The factor $\gamma^5$ registers the fact that the anomaly
is associated with chiral currents; this factor equals $-1$ for left-handed
fermions and $+1$ for right-handed fermions. In theories such as QED or QCD in
which the gauge bosons couple equally to right- and left-handed species, the
anomalies automatically cancel. This bookkeeping method is a special case of
the more general method presented in Section 19.4.

To evaluate the anomalies of the GWS theory, it is easiest to work in the
basis of SU(2) x U(1) gauge bosons, before the mixing to the photon and
$Z^0$ mass eigenstates. It suffices to evaluate the triangle diagrams for
massless fermions, so that right- and left-handed fermions have distinct
quantum numbers. However, we must consider not only the anomalies of diagrams
with three SU(2) x U(1) gauge bosons, but also diagrams with both
weak-interaction gauge bosons and color SU(3) gauge bosons of QCD. If we
consider effects of gravity on the weak-interaction gauge theory, there is also
a possibly anomalous diagram with one weak-interaction gauge boson and two
gravitons. We can omit diagrams, such as the anomaly of three SU(3) bosons or
of one SU(3) boson and two gravitons, in which all of the couplings are
left-right symmetric. Then the full set of diagrams with possible anomalous
terms is shown in Fig. 20.2. All of the possible anomalies must cancel if the
Ward identities of the SU(2) x U(1) gauge theory are to be satisfied.

It is a special property of SU(2) gauge theory that the anomaly of three
SU(2) gauge bosons always vanishes; this result follows from the property of
Pauli sigma matrices $\{\sigma^a, \sigma^b\} = 2\delta^{ab}$, which implies that
the trace (20.81) vanishes. The anomalies containing one SU(3) boson or one
SU(2) boson are proportional to

$$ \mathrm{tr}[t^a] = 0 \quad\text{or}\quad \mathrm{tr}[\tau^a] = 0. \tag{20.82} $$

The remaining nontrivial anomalies are those of one U(1) boson with two
SU(2) bosons or two SU(3) bosons, the anomaly of three U(1) bosons, and
the gravitational anomaly with one U(1) gauge boson.

The anomaly of one U(1) boson with two SU(3) bosons is proportional
to the group theory factor

$$ \mathrm{tr}\left[t^a t^b Y\right] = \frac{1}{2}\,\delta^{ab} \sum_q Y_q, \tag{20.83} $$

where the sum runs over left-handed quarks and right-handed quarks, with an
extra $(-1)$ for the left-handed contributions. Inserting the charge assignments
given above for $u_L$, $d_L$, $u_R$, and $d_R$, we find

$$ \sum_q Y_q = -2\cdot\bigl(\tfrac{1}{6}\bigr) + \bigl(\tfrac{2}{3}\bigr) + \bigl(-\tfrac{1}{3}\bigr) = 0. \tag{20.84} $$

Figure 20.2. Possible gauge anomalies of weak interaction theory. All of
these anomalies must vanish for the Glashow-Weinberg-Salam theory to be
consistent.

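Similarly, the anomaly of a U(1) boson with two SU(2) bosons is proportional
to

$$ \mathrm{tr}\left[\tau^a \tau^b Y\right] = \frac{1}{2}\,\delta^{ab} \sum_{f_L} Y_{f_L}, \tag{20.85} $$

where the sum runs over the left-handed fermions $E_L$ and $Q_L$. Thus,

$$ \sum_{f_L} Y_{f_L} = -\bigl(-\tfrac{1}{2}\bigr) - 3\cdot\bigl(\tfrac{1}{6}\bigr) = 0; \tag{20.86} $$

the factor 3 counts the color states of the quarks. The anomaly of three U(1)
gauge bosons is proportional to a sum involving left- and right-handed leptons
and quarks:

$$ \mathrm{tr}[Y^3] = -2\bigl(-\tfrac{1}{2}\bigr)^3 + (-1)^3 - 3\left[2\bigl(\tfrac{1}{6}\bigr)^3 - \bigl(\tfrac{2}{3}\bigr)^3 - \bigl(-\tfrac{1}{3}\bigr)^3\right] = 0. \tag{20.87} $$

Finally, the gravitational anomaly with one U(1) gauge boson is proportional
to

$$ \mathrm{tr}[Y] = -2\bigl(-\tfrac{1}{2}\bigr) + (-1) - 3\left[2\bigl(\tfrac{1}{6}\bigr) - \bigl(\tfrac{2}{3}\bigr) - \bigl(-\tfrac{1}{3}\bigr)\right] = 0. \tag{20.88} $$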
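The four sums in Eqs. (20.84), (20.86), (20.87), and (20.88) are simple enough to verify with exact rational arithmetic. The following Python sketch does so for one generation, using the sign convention stated above ($-1$ for left-handed fields, $+1$ for right-handed fields). The tuple layout and function name are illustrative assumptions.

```python
from fractions import Fraction as F

# (label, chirality sign, hypercharge Y, number of colors, SU(2) multiplicity)
GENERATION = [
    ("E_L", -1, F(-1, 2), 1, 2),
    ("e_R", +1, F(-1), 1, 1),
    ("Q_L", -1, F(1, 6), 3, 2),
    ("u_R", +1, F(2, 3), 3, 1),
    ("d_R", +1, F(-1, 3), 3, 1),
]


def anomaly_sums(fields):
    """Return the four hypercharge sums that must vanish for anomaly cancellation."""
    # U(1) with two SU(3) bosons: quarks only, one term per SU(2) component (20.84).
    su3 = sum(sign * y * n2 for _, sign, y, nc, n2 in fields if nc == 3)
    # U(1) with two SU(2) bosons: doublets only, one term per color (20.86).
    su2 = sum(sign * y * nc for _, sign, y, nc, n2 in fields if n2 == 2)
    # Three U(1) bosons: every state contributes Y^3 (20.87).
    cubic = sum(sign * y ** 3 * nc * n2 for _, sign, y, nc, n2 in fields)
    # One U(1) boson and two gravitons: every state contributes Y (20.88).
    grav = sum(sign * y * nc * n2 for _, sign, y, nc, n2 in fields)
    return {"U1-SU3-SU3": su3, "U1-SU2-SU2": su2, "U1^3": cubic, "U1-grav": grav}


if __name__ == "__main__":
    print(anomaly_sums(GENERATION))
    leptons_only = [f for f in GENERATION if f[3] == 1]
    print(anomaly_sums(leptons_only))
```

Running the same function on the leptons alone gives nonzero sums, which illustrates why quarks and leptons must appear together for the cancellation to work.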

The Glashow-Weinberg-Salam theory is thus a chiral gauge theory that
is completely free of axial vector anomalies among the gauge currents. However,
the cancellation of anomalies requires that leptons and quarks appear
in complete multiplets with the structure of $(E_L, e_R, Q_L, u_R, d_R)$. This
set of fields is often called a generation of quarks and leptons. The
consistency of the theory requires that quarks and leptons appear in Nature in
equal numbers, organizing themselves into successive generations in this way.

Experimental Consequences of the GWS Theory 

Now that we have a fundamental theory for the coupling of $W$ and $Z$ bosons
to fermions, we can work out the consequences of this theory for observable
processes mediated by weak bosons. This analysis should reproduce the
effective Lagrangian description of the weak interactions used in Chapters 17
and 18, and also predict additional observable effects of weak boson exchange.
In our discussion here, we will derive only the most basic relations in this
subject; we do not have space for a systematic survey of the phenomenology
of weak interactions. However, we encourage the reader to study the
experimental foundations of the weak interactions, which contain many beautiful
illustrations of the principles of quantum field theory.*

Figure 20.3. Some processes with virtual $W$ and $Z$ boson exchange.

At energies low compared to the vector boson masses, the couplings of the
weak bosons have their major effects through processes that involve virtual
weak boson exchange. These processes are shown in Fig. 20.3. We will derive
the Feynman rules for massive gauge bosons in Chapter 21. Meanwhile, it is
reasonable to guess that the $W$ and $Z$ boson propagators are given by

$$ \langle W^{\mu+}(p)\, W^{\nu-}(-p) \rangle = \frac{-i g^{\mu\nu}}{p^2 - m_W^2}, \qquad \langle Z^\mu(p)\, Z^\nu(-p) \rangle = \frac{-i g^{\mu\nu}}{p^2 - m_Z^2}. \tag{20.89} $$

We will see in Section 21.1 that these propagators give correct expressions for
diagrams with $W$ and $Z$ exchange up to terms of order $(m_f/m_W)$, where $m_f$
is a fermion mass.

First consider the $W$ exchange diagram in Fig. 20.3, in the limit of energies
low compared to the $W$ mass. We can then neglect the $p^2$ term in the
denominator of the $W$ propagator (20.89). Taking the $W$ coupling from
Eq. (20.79), we find that the diagram can be described by the effective
Lagrangian

$$ \begin{aligned}
\Delta\mathcal{L}_W &= \frac{g^2}{m_W^2}\, J^{\mu-}_W J^+_{\mu W} \\
&= \frac{g^2}{2 m_W^2} \left(\bar e_L \gamma^\mu \nu_L + \bar d_L \gamma^\mu u_L\right)\left(\bar\nu_L \gamma_\mu e_L + \bar u_L \gamma_\mu d_L\right).
\end{aligned} \tag{20.90} $$

The coefficient is often written in terms of the Fermi constant,

$$ \frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2}. \tag{20.91} $$
The various terms in this effective Lagrangian reproduce the expressions we
have already written in Eqs. (17.31), (18.28), and (18.29). Since these
interactions among leptons and quarks are mediated by the exchange of an
electrically charged vector boson, they are called collectively charged-current
interactions. The effective Lagrangian (20.90) turns out to provide an
impressively successful description of the phenomenology of charged-current weak
interactions. We have described its use in high-energy neutrino scattering,
but it has comparable successes in nuclear $\beta$-decay, muon decay, and a
variety of other processes.

*The experimental successes of the theory of weak interactions are reviewed in
the book of Commins and Bucksbaum (1983).

In a similar way, we can work out the effective Lagrangian resulting from
virtual $Z^0$ exchange. We find

$$ \begin{aligned}
\Delta\mathcal{L}_Z &= \frac{g^2}{2 m_Z^2}\, J^\mu_Z J_{\mu Z} \\
&= \frac{4 G_F}{\sqrt{2}} \Bigl(\sum_f \bar f \gamma^\mu \bigl(T^3 - \sin^2\theta_w\, Q\bigr) f\Bigr)^2,
\end{aligned} \tag{20.92} $$

where the sum in the second line runs over all left-handed and right-handed
flavors, and we have used relation (20.73) to simplify the prefactor. We say that
the effective Lagrangian (20.92) mediates neutral current weak interaction
processes. Notice that, if we define SU(2) gauge currents as

$$ J^{\mu a} = \sum_f \bar f \gamma^\mu T^a f, \tag{20.93} $$

then the effective Lagrangians of $W$ and $Z$ exchange can be written together
in the simple form

$$ \Delta\mathcal{L}_W + \Delta\mathcal{L}_Z = \frac{4 G_F}{\sqrt{2}} \left[(J^{\mu 1})^2 + (J^{\mu 2})^2 + \bigl(J^{\mu 3} - \sin^2\theta_w\, J^\mu_{EM}\bigr)^2\right]. \tag{20.94} $$

This expression becomes manifestly invariant under an unbroken global SU(2)
symmetry in the limit $g' \to 0$ or $\sin^2\theta_w \to 0$. We will discuss this
observation further at the end of this section.

The neutral current effective Lagrangian (20.92) contains terms that couple
together all of the various species of quarks and leptons. These terms
violate parity, and so distinguish themselves from the effects of strong and
electromagnetic interactions. For example, Eq. (20.92) predicts the existence
of neutral current deep inelastic neutrino scattering events, in which a
high-energy neutrino shatters a nucleon but does not convert to a final-state
muon or electron. This process is analyzed in Problem 20.4. Similarly, the
neutral-current interaction predicts the presence of parity-violating effects
in electron deep inelastic scattering. It also predicts a parity-violating
electron-nucleon interaction that should mix atomic energy levels, and a
similar parity-violating nucleon-nucleon interaction. Within the GWS theory,
the strengths of all of these various effects are predicted in terms of the
Fermi constant and one additional parameter, the value of $\sin^2\theta_w$.
Thus, the GWS theory can be tested by observing each of these effects and
asking whether a single value of this parameter can account for the strengths
of all of these disparate processes.






Figure 20.4. Diagrams contributing to the process $e^+e^- \to f\bar f$ in the
Glashow-Weinberg-Salam theory.

Further tests of the GWS theory are available at higher energies. The
process $e^+e^- \to f\bar f$ is affected in an essential way, since the theory
contains a new diagram with s-channel $Z^0$ exchange, which interferes with the
standard photon exchange diagram, as shown in Fig. 20.4. It is straightforward
to work out the effects of this interference using the methods of Section 5.2,
so we have left this analysis as Problem 20.3.

As the center-of-mass energy approaches $m_Z$, the $Z^0$ appears directly as
a resonance in the $e^+e^-$ annihilation cross section. Similarly, both the $W$
and the $Z$ can be observed as resonances in quark-antiquark annihilation,
viewed as a parton subprocess in proton-antiproton scattering. The positions of
these resonances are predicted from $G_F$, $\sin^2\theta_w$, and the value of
$e$ or $\alpha$, according to Eqs. (20.72) and (20.91). Using these relations,
we find


(20.95) 
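Combining Eqs. (20.72) and (20.91) with $e^2 = 4\pi\alpha$ gives $m_W^2 = \pi\alpha/(\sqrt{2}\, G_F \sin^2\theta_w)$, and Eq. (20.73) then fixes $m_Z = m_W/\cos\theta_w$. The following Python sketch evaluates these tree-level relations. The numerical inputs ($\alpha \approx 1/137.036$, $G_F \approx 1.16637 \times 10^{-5}$ GeV$^{-2}$, $\sin^2\theta_w = 0.23$) are standard reference values assumed here for illustration, and the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.036      # fine-structure constant at low energy (assumed input)
G_FERMI = 1.16637e-5       # Fermi constant in GeV^-2 (assumed input)


def tree_level_masses(alpha, g_fermi, sin2_theta_w):
    """Tree-level W and Z masses in GeV from Eqs. (20.72), (20.73), and (20.91)."""
    m_w = math.sqrt(math.pi * alpha / (math.sqrt(2.0) * g_fermi * sin2_theta_w))
    m_z = m_w / math.sqrt(1.0 - sin2_theta_w)
    return m_w, m_z


if __name__ == "__main__":
    m_w, m_z = tree_level_masses(ALPHA, G_FERMI, 0.23)
    print(f"m_W = {m_w:.1f} GeV, m_Z = {m_z:.1f} GeV")
```

With the low-energy value of $\alpha$, the results come out roughly at 78 and 89 GeV, a few percent below the measured masses; a difference of this size is the kind of effect that the radiative corrections mentioned later in this section are expected to account for.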


The detailed shape of the $Z^0$ resonance is shown in Fig. 20.5. The
experimental measurements shown are compared to a theoretical curve with the
resonance position adjusted for the best fit. The height and width of the
resonance are then predicted by the GWS theory. The resonance is broadened to
higher energies by processes in which the electron and positron radiate
collinear photons before annihilation; this correction was discussed in
Problem 5.5.

Because the Lagrangian of the GWS theory treats left- and right-handed
fermions as distinct species with completely different quantum numbers, the
couplings of the $Z^0$ to left- and right-handed fermions differ significantly.
One manifestation of this is the presence of a polarization asymmetry, a net
polarization of fermions produced in the decay $Z^0 \to f\bar f$, or an
asymmetry in the inverse process of $Z^0$ production. This asymmetry can be
read directly from the form of the $Z^0$ current given in (20.80):

$$ \begin{aligned}
A^f_{LR} &= \frac{\Gamma(Z^0 \to f_L \bar f_R) - \Gamma(Z^0 \to f_R \bar f_L)}{\Gamma(Z^0 \to f_L \bar f_R) + \Gamma(Z^0 \to f_R \bar f_L)} \\
&= \frac{\bigl(\tfrac{1}{2} - |Q_f| \sin^2\theta_w\bigr)^2 - \bigl(Q_f \sin^2\theta_w\bigr)^2}{\bigl(\tfrac{1}{2} - |Q_f| \sin^2\theta_w\bigr)^2 + \bigl(Q_f \sin^2\theta_w\bigr)^2}.
\end{aligned} \tag{20.96} $$

Figure 20.5. The total cross section for $e^+e^-$ annihilation to hadrons for
$E_{cm}$ close to the $Z^0$ boson mass, as measured by the ALEPH, DELPHI, L3,
and OPAL experiments and compiled by the Particle Data Group, Phys. Rev.
D50, (1994), Fig. 32.14. References to the original articles are given there.
The solid curve is the prediction of the GWS theory.

For a realistic value $\sin^2\theta_w = 0.23$, this expression gives a 15%
asymmetry for charged leptons and a 95% asymmetry for $d$, $s$, and $b$ quarks.
The asymmetry can be checked experimentally for leptons by measuring the
polarization of $\tau$ leptons at the $Z^0$ resonance, or by measuring the
relative cross sections for producing the resonance using left- versus
right-handed electrons. For quarks, the asymmetry can be determined indirectly
from the forward-backward production asymmetry on the resonance, as explained
in Problem 20.3.
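The asymmetry formula above depends only on $|Q_f|$ and $\sin^2\theta_w$, so the quoted numbers are easy to reproduce. The sketch below evaluates the second line of the formula for charged leptons and for down-type quarks; the function name is an illustrative assumption.

```python
def polarization_asymmetry(abs_charge, sin2_theta_w):
    """Left-right asymmetry from the left- and right-handed Z couplings."""
    g_left = 0.5 - abs_charge * sin2_theta_w
    g_right = abs_charge * sin2_theta_w
    return (g_left ** 2 - g_right ** 2) / (g_left ** 2 + g_right ** 2)


if __name__ == "__main__":
    sin2 = 0.23
    print("charged leptons: ", polarization_asymmetry(1.0, sin2))
    print("down-type quarks:", polarization_asymmetry(1.0 / 3.0, sin2))
```

For $\sin^2\theta_w = 0.23$ the two values come out near 0.16 and 0.94, in line with the rounded figures of 15% and 95% quoted in the text. The lepton asymmetry is small because $\sin^2\theta_w$ is close to 1/4, where the left- and right-handed couplings of a unit-charge fermion have equal magnitude.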

Because the weak neutral current has so many different manifestations,
the GWS theory of weak interactions can be subjected to a stringent test by
comparing the values of the parameter $\sin^2\theta_w$ needed to account for
each of its predicted effects. Table 20.1 presents the values of
$\sin^2\theta_w$ extracted from a wide variety of weak interaction neutral
current effects and asymmetries. In all cases, one-loop radiative corrections
must be included to analyze the experiment at the required level of accuracy.
These radiative corrections involve some subtlety. First, one must adopt a
specific renormalization convention that defines $\sin^2\theta_w$ and use it
consistently in all calculations. The table shows results for two different
choices of this convention. In both conventions, the values of weak-interaction
observables are taken to be functions of $\alpha$, $G_F$, and a third
independent parameter. In the first column this parameter is the mass
ratio $m_W/m_Z$, and, following the tree-level expression (20.73), we consider
this ratio to define a renormalized value of $\sin^2\theta_w$:

$$ s_W^2 = 1 - \frac{m_W^2}{m_Z^2}. \tag{20.97} $$

In the second column, the third parameter is $\sin^2\theta_w$ computed from the
weak interaction coupling constants defined by minimal subtraction
(Eq. (11.77)). The differences between different definitions of
$\sin^2\theta_w$ appear at the level of one-loop computations and can reveal
interesting physics; this subject is discussed in Section 21.3.

Table 20.1. Values of $\sin^2\theta_w$ from Weak Interaction Experiments

Observed Quantity or Process                   $s_W^2$        $\sin^2\theta_w|_{\overline{MS}}$
$m_Z$                                          0.2247 (21)    0.2320 (6)
$m_W$                                          0.2264 (25)    0.2338 (22)
$\Gamma_Z$                                     0.2250 (18)    0.2322 (6)
Lepton f-b asymmetries at the $Z^0$            0.2243 (17)    0.2315 (11)
All pair-production asymmetries at the $Z^0$   0.2245 (17)    0.2317 (8)
$A^e_{LR}$ at the $Z^0$                        0.2221 (17)    0.2292 (10)
Deep inelastic neutrino scattering             0.2260 (48)    0.233 (5)
Neutrino-proton elastic scattering             0.205 (31)     0.212 (32)
Neutrino-electron elastic scattering           0.224 (9)      0.231 (9)
Atomic parity violation                        0.216 (8)      0.223 (8)
Parity violation in inelastic $e^-$ scattering 0.216 (17)     0.223 (18)

The values listed here are obtained by fitting experimental observations by
adjusting the value of $s_W^2$ or $\sin^2\theta_w|_{\overline{MS}}$, taking
$\alpha$ and $G_F$ as accurately known parameters. The numbers in parentheses
are the standard errors in the last displayed digits. The conversion from the
experimentally measured quantities to $s_W^2$ or
$\sin^2\theta_w|_{\overline{MS}}$ depends on the value of the top quark mass
and the mass of the Higgs boson. These values assume a top quark mass of
169 GeV and a Higgs mass of 300 GeV; the quoted errors include an uncertainty
of 17 GeV in the top quark mass and a range from 60 GeV to 1000 GeV for the
Higgs mass. The differences in the relative errors between the two columns
reflect the importance of this theoretical uncertainty. Some observables depend
weakly on $\alpha_s$; these values assume $\alpha_s(m_Z) = 0.120 \pm 0.007$.
This table is taken from the article of P. Langacker and J. Erler for the
Particle Data Group, Phys. Rev. D50, 1304 (1994). That article contains a full
set of references and a discussion of the sources of uncertainty in these
determinations.

A second subtlety is that the one-loop corrections to weak neutral current 
processes depend on the value of the t quark mass, which has only recently 
been determined and is still somewhat poorly known. The dependence on the 
t quark mass is relatively strong, for interesting reasons that we will discuss 
in Section 21.3. The one-loop corrections also depend weakly on properties of 
the particles responsible for the spontaneous symmetry breaking.

We can see from Table 20.1 that a wide variety of effects due to the
weak neutral current have been observed, with magnitudes accounted for by
a single, consistent value of $\sin^2\theta_w$. This remarkable concordance of
theory and experiment gives us confidence that the Glashow-Weinberg-Salam
theory is indeed the correct description of weak and electromagnetic
interactions.


Fermion Mass Terms 


We now return to the problem of writing mass terms for the quarks and 
leptons. Recall that one cannot put ordinary mass terms into the Lagrangian, 
because the left- and right-handed components of the various fermion fields 
have different gauge quantum numbers and so simple mass terms violate gauge 
invariance. To give masses to the quarks and leptons, we must again invoke 
the mechanism of spontaneous symmetry breaking. 

We began this section by assuming that a scalar field $\phi$ acquires a vacuum
expectation value (20.58), in order to give mass to the $W$ and $Z$ bosons. This
scalar field needed to be a spinor under SU(2) and to have $Y = 1/2$ in order
to produce the correct pattern of gauge boson masses. With these quantum
numbers, we can also write a gauge-invariant coupling linking $e_L$, $e_R$, and
$\phi$, as follows:

$$ \Delta\mathcal{L}_e = -\lambda_e\, \bar E_L \cdot \phi\, e_R + \text{h.c.} \tag{20.98} $$

Here the SU(2) indices of the doublets $E_L$ and $\phi$ are contracted; notice
also that the charges $Y$ of the various fields sum to zero. The parameter
$\lambda_e$ is a new dimensionless coupling constant. If we replace $\phi$ in
this expression by its vacuum expectation value (20.58), we obtain

$$ \Delta\mathcal{L}_e = -\frac{1}{\sqrt{2}}\, \lambda_e v\, \bar e_L e_R + \text{h.c.} + \cdots. \tag{20.99} $$


This is a mass term for the electron. The size of the mass is set by the vacuum
expectation value of $\phi$, rescaled by the new dimensionless coupling:

$$ m_e = \frac{1}{\sqrt{2}}\, \lambda_e v. \tag{20.100} $$

Since the electron mass is proportional to $v$, one might expect that the
masses of the electron and the $W$ boson should be of the same order. In fact,
taking the observed values, $m_e/m_W \sim 6 \times 10^{-6}$. Since $\lambda_e$
is a renormalizable coupling, it must be treated as an input to the theory.
Thus the GWS theory allows the electron to be very light, but it cannot explain
why the electron is so light compared to the $W$ boson.
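To get a feeling for the sizes involved, one can combine $m_W = gv/2$ from Eq. (20.63) with Eq. (20.91) to obtain $v = (\sqrt{2}\, G_F)^{-1/2}$, and then invert Eq. (20.100) for $\lambda_e$. The Python sketch below does this arithmetic. The numerical inputs ($G_F$, $m_e$, and $m_W$) are standard reference values assumed here for illustration, and the function names are hypothetical.

```python
import math

G_FERMI = 1.16637e-5   # Fermi constant in GeV^-2 (assumed input)
M_ELECTRON = 0.511e-3  # electron mass in GeV (assumed input)
M_W = 80.4             # W boson mass in GeV (assumed input)


def higgs_vev(g_fermi):
    """v = (sqrt(2) G_F)^(-1/2), from m_W = g v / 2 and G_F / sqrt(2) = g^2 / (8 m_W^2)."""
    return (math.sqrt(2.0) * g_fermi) ** -0.5


def yukawa_coupling(mass, vev):
    """Invert m = lambda v / sqrt(2), Eq. (20.100), for the dimensionless coupling."""
    return math.sqrt(2.0) * mass / vev


if __name__ == "__main__":
    v = higgs_vev(G_FERMI)
    print(f"v        = {v:.1f} GeV")
    print(f"lambda_e = {yukawa_coupling(M_ELECTRON, v):.2e}")
    print(f"m_e/m_W  = {M_ELECTRON / M_W:.2e}")
```

The coupling $\lambda_e$ comes out of order $10^{-6}$, which restates the puzzle described above in terms of a single small dimensionless input.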

We can write mass terms for the quark fields in the same way. Notice
that, in the following expression, both terms are invariant under SU(2) and
have zero net $Y$:


$$ \Delta\mathcal{L}_q = -\lambda_d\, \bar Q_L \cdot \phi\, d_R - \lambda_u\, \epsilon^{ab}\, \bar Q_{La}\, \phi^\dagger_b\, u_R + \text{h.c.} \tag{20.101} $$

Substituting the vacuum expectation value of $\phi$ from Eq. (20.58), these
terms become

$$ \Delta\mathcal{L}_q = -\frac{1}{\sqrt{2}}\, \lambda_d v\, \bar d_L d_R - \frac{1}{\sqrt{2}}\, \lambda_u v\, \bar u_L u_R + \text{h.c.} + \cdots, \tag{20.102} $$

standard mass terms for the $d$ and $u$ quarks. The GWS theory thus gives the
relations

$$ m_d = \frac{1}{\sqrt{2}}\, \lambda_d v, \qquad m_u = \frac{1}{\sqrt{2}}\, \lambda_u v. \tag{20.103} $$

As with the electron, the theory parametrizes but does not explain the small
values of the $d$ and $u$ quark masses observed experimentally.

When additional generations of quarks are introduced into the theory,
there can be additional coupling terms that mix generations. Alternatively,
we can diagonalize the Higgs couplings by choosing a new basis for the quark
fields. We will show in Section 20.3 that this is always possible. However,
this simplification of the Higgs couplings causes a complication in the gauge
couplings. Let

$$ u^i_L = (u_L, c_L, t_L), \qquad d^i_L = (d_L, s_L, b_L) \tag{20.104} $$

denote the up- and down-type quarks in their original basis, and let $u'^i_L$
and $d'^i_L$ denote the quarks in the basis that diagonalizes their Higgs
couplings. This latter basis is the physical one, since it is the basis that
diagonalizes the mass matrix. The two bases are related by unitary
transformations:

$$ u^i_L = U_u^{ij}\, u'^j_L, \qquad d^i_L = U_d^{ij}\, d'^j_L. \tag{20.105} $$

In this new basis, the $W$ boson current takes the form

$$ J^{\mu+}_W = \frac{1}{\sqrt{2}}\, \bar u^i_L \gamma^\mu d^i_L = \frac{1}{\sqrt{2}}\, \bar u'^i_L \gamma^\mu \bigl(U_u^\dagger U_d\bigr)_{ij}\, d'^j_L. \tag{20.106} $$

This expression is conventionally written

$$ J^{\mu+}_W = \frac{1}{\sqrt{2}}\, \bar u'^i_L \gamma^\mu V_{ij}\, d'^j_L, \tag{20.107} $$

where $V_{ij}$ is a unitary matrix called the Cabibbo-Kobayashi-Maskawa (CKM)
matrix. The off-diagonal terms in $V_{ij}$ allow weak-interaction transitions
between quark generations. For example, restricting to two generations for
simplicity and writing, for the row of $V$ that multiplies $\bar u'_L$,

$$ V_{1j}\, d'^j_L = \cos\theta_c\, d'_L + \sin\theta_c\, s'_L, \tag{20.108} $$

the term proportional to $\sin\theta_c$ allows an $s$ quark to decay weakly to
a $u$ quark. We have made use of this structure in our discussion of the
effective Lagrangian for $K$ meson decays in Section 18.2. We will discuss CKM
flavor mixing and its symmetry properties in more detail in Section 20.3.
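The construction of the mixing matrix as $V = U_u^\dagger U_d$ can be illustrated with a two-generation toy example. In the Python sketch below, both basis changes are taken to be real rotations, which is an assumption made purely for simplicity; the rotation angles (including the value used for $\theta_c$) and all function names are hypothetical. The sketch checks that $V$ is unitary and that only the relative rotation between the up-type and down-type bases survives.

```python
import math


def rotation(theta):
    """2x2 rotation with entries [[cos, sin], [-sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def dagger(m):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mixing_matrix(u_up, u_down):
    """V = U_u^dagger U_d, as in Eqs. (20.106) and (20.107)."""
    return matmul(dagger(u_up), u_down)


if __name__ == "__main__":
    theta_c = 0.227                      # illustrative mixing angle in radians
    u_up = rotation(0.1)                 # arbitrary up-type basis change
    u_down = rotation(0.1 + theta_c)     # down-type basis change
    v = mixing_matrix(u_up, u_down)
    print("V =", v)
    print("V^dagger V =", matmul(dagger(v), v))
```

Changing the common angle 0.1 leaves $V$ unchanged, which reflects the fact that only the mismatch between the two basis changes is observable in the $W$ current.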

It is interesting to note that there is no term within the structure we
have described that gives a mass to the neutrino. If we wanted to generalize
Eq. (20.98) to allow a neutrino mass term, we would have to introduce a new
fermion field $\nu_R$ that is completely neutral under SU(2) x U(1). Then we
could write the Higgs coupling

$$ \Delta\mathcal{L}_\nu = -\lambda_\nu\, \epsilon^{ab}\, \bar E_{La}\, \phi^\dagger_b\, \nu_R + \text{h.c.}, \tag{20.109} $$

which would give the $\nu_e$ a mass, presumably comparable to that of the
electron. But we know from experiment that neutrino masses are extremely small;
the mass of the $\nu_e$ is known to be less than 10 eV. This extreme
suppression of the neutrino masses would be naturally explained if the states
$\nu_R$ do not exist. We will show in Section 20.3 that this assumption also
implies that there are no transitions between leptons of different generations;
this result is also in accord with very strong experimental bounds.

The Higgs Boson 

This discussion of fermion mass generation emphasizes that the scalar field 
that causes spontaneous breaking of the gauge symmetry is an important 
ingredient in the structure of the Glashow-Weinberg-Salam theory. We should 
therefore ask whether it has any more direct manifestations. 

To investigate this question, we will work in the unitarity gauge, analogous
to that used for the Abelian model in Eq. (20.12). Let us parametrize the scalar
field $\phi$ by writing

$$ \phi(x) = U(x) \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ v + h(x) \end{pmatrix}. $$    (20.110)

The two-component spinor on the right has an arbitrary real-valued lower
component, given by the vacuum expectation value of $\phi$ plus a fluctuating
real-valued field h(x) with $\langle h(x) \rangle = 0$. This spinor is acted on by a general
$SU(2)$ gauge transformation U(x) to produce the most general complex-valued
two-component spinor. We can now make a gauge transformation to eliminate
U(x) from the Lagrangian. This reduces $\phi$ to a field with one physical degree
of freedom. 

An explicit renormalizable Lagrangian that leads to a vacuum expectation 
value for $\phi$ is

$$ \mathcal{L} = |D_\mu \phi|^2 + \mu^2 \phi^\dagger \phi - \lambda (\phi^\dagger \phi)^2. $$    (20.111)

The minimum of the potential energy occurs at

$$ v = \left( \frac{\mu^2}{\lambda} \right)^{1/2}. $$    (20.112)

In the unitarity gauge, the potential energy terms in (20.111) take the form

$$ \mathcal{L}_V = -\mu^2 h^2 - \lambda v h^3 - \frac{1}{4} \lambda h^4 = -\frac{1}{2} m_h^2 h^2 - \sqrt{\frac{\lambda}{2}} \, m_h h^3 - \frac{1}{4} \lambda h^4. $$    (20.113)

Figure 20.6. Feynman rules for the couplings of the Higgs boson to vector
bosons, to fermions, and to itself.

The quantum of the field h(x) is a scalar particle with mass

$$ m_h = \sqrt{2} \, \mu = \sqrt{2\lambda} \, v. $$    (20.114)

This particle is known as the Higgs boson. As for the fermions in the GWS 
theory, the Higgs boson has a mass whose general magnitude is controlled by 
the vacuum expectation value v, but whose precise value is determined by a 
new, unspecified, renormalizable coupling constant. 
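
The expansion of the potential about its minimum can be checked mechanically. The sketch below is only an illustration: the function name `potential_coefficients` and the sample values of the coupling and of v are arbitrary choices, not taken from the text. It expands the potential energy V(h) = -mu^2 (v+h)^2 / 2 + lambda (v+h)^4 / 4 in exact rational arithmetic, with mu^2 fixed by the minimum condition (20.112), and confirms the coefficients of h^2, h^3, and h^4 that appear (with opposite sign) in (20.113).

```python
from fractions import Fraction
from math import comb

def potential_coefficients(lam, v):
    """Return (mu2, c), where c[k] is the coefficient of h**k in
    V(h) = -mu2 * (v + h)**2 / 2 + lam * (v + h)**4 / 4
    and mu2 = lam * v**2 is fixed by the minimum condition (20.112)."""
    mu2 = lam * v**2
    c = []
    for k in range(5):
        # Binomial expansion of each term; the quadratic term stops at h**2.
        quad = -mu2 / 2 * comb(2, k) * v**(2 - k) if k <= 2 else 0
        quart = lam / 4 * comb(4, k) * v**(4 - k)
        c.append(quad + quart)
    return mu2, c

lam, v = Fraction(1, 8), Fraction(3)  # arbitrary illustrative values
mu2, c = potential_coefficients(lam, v)

print("linear term (should vanish):", c[1])
print("h^2 coefficient:", c[2], " mu^2:", mu2)
print("h^3 coefficient:", c[3], " lambda*v:", lam * v)
print("h^4 coefficient:", c[4], " lambda/4:", lam / 4)
print("m_h^2 = 2 * (h^2 coefficient):", 2 * c[2], " 2*lambda*v^2:", 2 * lam * v**2)
```

Because the h^2 coefficient of the potential equals m_h^2 / 2, the last line of output compares m_h^2 with 2 lambda v^2 directly.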

The expansion of the kinetic energy term of (20.111) in unitarity gauge 
yields the gauge boson mass term (20.62), plus additional terms involving the 
Higgs boson field. Explicitly, 

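$$ \mathcal{L}_K = \frac{1}{2} (\partial_\mu h)^2 + \left[ m_W^2 W^{\mu+} W_\mu^- + \frac{1}{2} m_Z^2 Z^\mu Z_\mu \right] \cdot \left( 1 + \frac{h}{v} \right)^2, $$    (20.115)

where $m_W$ and $m_Z$ are given by Eqs. (20.63).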

Finally, the fermion mass terms in Eqs. (20.98) and (20.101) lead to direct 
couplings of the Higgs boson to fermions. Evaluating these terms in unitarity 
gauge, we find that, for any quark or lepton flavor, the Higgs boson couples 
according to 

$$ \mathcal{L}_f = -m_f \bar f f \left( 1 + \frac{h}{v} \right). $$    (20.116)

The Higgs boson couplings in Eqs. (20.113), (20.115), and (20.116) lead to 
the Feynman rules shown in Fig. 20.6. 

In general, the couplings of the Higgs boson to other particles of the weak 
interaction theory are proportional to the masses of those particles. Thus, the 
particles that are most easily made in the laboratory have very weak couplings 
to the Higgs boson, which makes it difficult to observe this particle. In any 
event, the Higgs boson has not yet been found. As of this writing, the Higgs 
boson that we have just described has been searched for and excluded for 
values of $m_h$ below 60 GeV. If the self-coupling $\lambda$ is large, however, the Higgs
boson could have a mass as large as 1000 GeV; thus, a large dynamic range 
remains unexplored. 
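
To give a feeling for how weak these couplings are, the sketch below evaluates the dimensionless coefficient m_f / v of the Higgs-fermion term in (20.116) for the charged leptons. Everything numerical here is an assumption made for illustration: the lepton masses are approximate values that do not appear in the text, and v = 250 GeV is taken from v = 2F with the value F = 125 GeV quoted in this chapter. The name `higgs_coupling` is hypothetical.

```python
# Approximate charged-lepton masses in GeV (illustrative inputs, not from the text).
LEPTON_MASSES_GEV = {"e": 0.000511, "mu": 0.1057, "tau": 1.777}

def higgs_coupling(m_f_gev, v_gev=250.0):
    """Dimensionless coefficient m_f / v of the h f-bar f term in Eq. (20.116).

    The default v_gev = 250 assumes v = 2F with F = 125 GeV.
    """
    return m_f_gev / v_gev

couplings = {name: higgs_coupling(m) for name, m in LEPTON_MASSES_GEV.items()}
for name, y in couplings.items():
    print(f"{name}: m_f / v = {y:.2e}")
```

The ratio of any two of these couplings is just the ratio of the masses, which is the sense in which the lightest, most easily produced particles couple most weakly to the Higgs boson.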
The phenomenological properties of the Higgs boson are worked out in
more detail in the Final Project of Part III. 

A Higgs Sector? 

Since there is no experimental evidence for the existence of the simple Higgs 
boson contained in the GWS model, it is worth asking whether the W and Z 
bosons might acquire mass by a more complicated mechanism. There are two 
aspects to this question. 

First, is it certain that the W and Z bosons are gauge bosons of a spontaneously
broken $SU(2) \times U(1)$ symmetry? The evidence for this idea comes
from the universality of the couplings of the various quarks and leptons to the 
weak interactions. This universality is tested in the fact that the same value of 
the Fermi constant describes all charged-current weak-interaction processes, 
and that this same strength of coupling combined with a single value of $\sin^2\theta_w$
describes the whole range of weak neutral current phenomena. We have seen, 
especially in the discussion of Chapter 16, that the principle of local gauge 
invariance leads naturally to the prediction of universal, flavor-independent, 
coupling constants. No other principle is known that would explain this striking
regularity. Thus there is compelling evidence that the underlying theory
of the weak interactions is a spontaneously broken gauge theory. 

However, it is quite possible that the mechanism of the spontaneous breaking
of $SU(2) \times U(1)$ is more complicated than the simple model of a single
scalar field that we have written in Eq. (20.111). In principle, the breaking of
$SU(2) \times U(1)$ might be the result of the dynamics of a complicated new set of
particles and interactions, which we will refer to as the Higgs sector. Experiment
gives us only three properties of this new sector: First, it must generate
the masses of the quarks and leptons. Second, it must generate the masses of
the W and Z bosons. The third piece of information, which is the only nontrivial
one, comes from the relation (20.73) between weak boson masses in the
GWS theory, 

$$ m_W = m_Z \cos\theta_w. $$    (20.117)

This relation is satisfied experimentally to better than 1% accuracy, that is, to
the level of one-loop radiative corrections. Whatever complicated mechanism
we invoke to generate the spontaneous breaking of $SU(2) \times U(1)$, it should
reproduce this relation in a natural way. 

To understand the implications of relation (20.117), we must analyze the 
gauge boson mass matrix without assuming that $SU(2) \times U(1)$ is broken by
the expectation value of a scalar field. Actually, it is possible to compute 
the gauge boson mass matrix under much less restrictive assumptions, using 
the argument given at the very end of Section 20.1. There we constructed 
the gauge boson mass matrix from the matrix elements for gauge currents to 
create or destroy Goldstone bosons. We will now show that relation (20.117) 
follows for a large class of models of $SU(2) \times U(1)$ breaking for which these
matrix elements satisfy certain simple properties. 
Any model of weak-interaction symmetry breaking must contain some set
of fields that is responsible for the spontaneous breaking of $SU(2) \times U(1)$.
Think of this sector of the theory as a field theory with a global $SU(2) \times U(1)$
symmetry, which is promoted to a local symmetry through its coupling to
gauge bosons. In the theory with global symmetry, this symmetry is spontaneously
broken to $U(1)$. Since three continuous symmetries are spontaneously
broken, this sector must supply three Goldstone bosons, which will eventually
be eaten by $W^+$, $W^-$, and $Z^0$. Call these three bosons $\pi^a$, where a = 1,2,3.
Let $J^{\mu a}$ be the $SU(2)$ symmetry currents of the new sector, and let $J^{\mu Y}$ be
the $U(1)$ current. The gauge boson mass matrix will then be constructed from
the matrix elements (20.46), which here take the form

$$ \langle 0 | J^{\mu A} | \pi^b(p) \rangle = -i p^\mu F^A{}_b, $$    (20.118)

with A = 1,2,3,Y and b = 1,2,3. Using the method of Eq. (20.55), we find
that the gauge boson vacuum polarization contains the pole term

$$ -i \frac{k^\mu k^\nu}{k^2} (g_A F^A{}_c)(g_B F^B{}_c), $$    (20.119)

summed over c, where $g_A = g$ for A = 1,2,3 and $g_A = g'$ for A = Y. Then we
can identify the gauge boson mass matrix as

$$ m^2_{AB} = g_A g_B F^A{}_c F^B{}_c. $$    (20.120)

To reproduce the known form of the weak gauge boson mass matrix, we 
must now place constraints on the $F^A{}_b$. First we must ensure that the photon
remains massless. This follows if the linear combination of charges (20.69)
annihilates the vacuum. In the language of Eq. (20.118), we must insist that
the corresponding linear combination of currents cannot excite a Goldstone
boson:

$$ \langle 0 | (J^{\mu 3} + J^{\mu Y}) | \pi^a(p) \rangle = 0. $$    (20.121)

We can also achieve relation (20.117), using the following additional assumption:
The symmetry-breaking sector has an $SU(2)$ global symmetry, under
which the three Goldstone bosons and the three $SU(2)$ gauge currents transform
as triplets, which remains exact when the $SU(2)$ gauge symmetry is spontaneously
broken. This global $SU(2)$ symmetry implies that, if A = a = 1,2,3
in Eq. (20.118),

$$ \langle 0 | J^{\mu a} | \pi^b(p) \rangle = -i F p^\mu \delta^{ab}, $$    (20.122)

where F is a parameter with the dimensions of mass. Combining (20.122) and
(20.121), we have

$$ \langle 0 | J^{\mu Y} | \pi^3(p) \rangle = +i F p^\mu. $$    (20.123)

Inserting this form for $F^A{}_b$ into (20.120), we find the gauge boson mass matrix

$$ m^2 = F^2 \begin{pmatrix} g^2 & & & \\ & g^2 & & \\ & & g^2 & -g g' \\ & & -g g' & g'^2 \end{pmatrix}, $$    (20.124)

where the matrix acts on the gauge bosons $(A^1_\mu, A^2_\mu, A^3_\mu, B_\mu)$. The eigenvectors
of this matrix are precisely (20.63) and (20.64). To reproduce the eigenvalues,
we need only define

$$ v = 2F. $$    (20.125)

We have now shown that the GWS relation between the W and Z boson
masses is not special to the situation in which the gauge symmetry is broken
by a single scalar field. This relation follows from the much more general assumption
of an unbroken global $SU(2)$ symmetry of the Higgs sector. This
symmetry is often called custodial $SU(2)$ symmetry.* We have seen this symmetry
already as the global $SU(2)$ symmetry of the weak-interaction effective
Lagrangian (20.94).
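
The eigenvalue structure of the mass matrix obtained from the custodial-symmetry argument can be verified numerically. The following sketch is an illustration under stated assumptions: the coupling values g = 0.65 and g' = 0.35 are arbitrary sample numbers, F = 125 GeV is the value quoted in this chapter, the function names are hypothetical, and the Z and photon vectors are the standard combinations proportional to $g A^3 - g' B$ and $g' A^3 + g B$. It builds the matrix with entries $g^2$, $-g g'$, $g'^2$ (times $F^2$) in the basis $(A^1, A^2, A^3, B)$ and checks that the photon is massless and that $m_W = m_Z \cos\theta_w$.

```python
import math

def mass_matrix(g, g_prime, F):
    """Gauge boson mass-squared matrix in the basis (A1, A2, A3, B)."""
    gg, ggp, gpgp = g * g, g * g_prime, g_prime * g_prime
    return [
        [F * F * gg, 0.0, 0.0, 0.0],
        [0.0, F * F * gg, 0.0, 0.0],
        [0.0, 0.0, F * F * gg, -F * F * ggp],
        [0.0, 0.0, -F * F * ggp, F * F * gpgp],
    ]

def mat_vec(m, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m[i][j] * vec[j] for j in range(len(vec))) for i in range(len(m))]

g, g_prime, F = 0.65, 0.35, 125.0  # couplings are illustrative; F in GeV
m2 = mass_matrix(g, g_prime, F)
norm = math.hypot(g, g_prime)

z_vec = [0.0, 0.0, g / norm, -g_prime / norm]      # Z direction
photon_vec = [0.0, 0.0, g_prime / norm, g / norm]  # photon direction

m_w = g * F                # from the A1, A2 entries: m_W^2 = g^2 F^2
m_z = norm * F             # m_Z^2 = (g^2 + g'^2) F^2
cos_theta_w = g / norm

z_image = mat_vec(m2, z_vec)
photon_image = mat_vec(m2, photon_vec)

print("m^2 acting on Z:", z_image, " m_Z^2 * Z:", [m_z**2 * x for x in z_vec])
print("m^2 acting on photon:", photon_image)
print("m_W / (m_Z cos theta_w):", m_w / (m_z * cos_theta_w))
```

With v = 2F, the values m_w and m_z computed here are g v / 2 and (g^2 + g'^2)^{1/2} v / 2, the forms obtained for a single Higgs doublet.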

For the case of a single scalar field, the custodial symmetry arises in the 
following way: If we write the field $\phi$ in terms of its four real components, the
Lagrangian (20.111) (ignoring the gauge couplings) has $O(4)$ global symmetry.
The vacuum expectation value of $\phi$ breaks this symmetry down to $O(3)$, that
is, $SU(2)$.

However, there are many other quantum field theories that break $SU(2)$
spontaneously while leaving another global $SU(2)$ symmetry unbroken. One
rather complex example is given by QCD with two massless flavors, if we
identify the gauged $SU(2)$ with the symmetry generated by $U_L$ in (19.82)
and identify the custodial $SU(2)$ with vectorial isospin symmetry. A copy
of the familiar strong interactions with a mass scale large enough to give 
F = 125 GeV would be a perfectly acceptable model for the Higgs sector. 
(Unfortunately, it is not easy in this model to generate masses for the quarks 
and leptons.) 

The question of the nature of the Higgs sector and the explicit mechanism 
of $SU(2) \times U(1)$ breaking is probably the most pressing open problem in the
theory of elementary particles. We will discuss this question further in the 
Epilogue. 

20.3 Symmetries of the Theory of Quarks and Leptons 

Putting together the theory of strong interactions described in Chapter 17 and 
the theory of weak and electromagnetic interactions described in the previous 
section, we have now constructed a complete description of elementary particle 
interactions. It is interesting to investigate the symmetries of this theory, to 
ask what might be the fundamental symmetries of the underlying description 
of Nature. 

We have already seen, in the arguments leading up to Eq. (15.17), that 
the Lagrangian of a gauge theory is highly restricted by the conditions of 
renormalizability and gauge invariance. In this section, we will construct the
most general renormalizable Lagrangian consistent with the $SU(3) \times SU(2) \times U(1)$
gauge symmetries of the strong, weak and electromagnetic interactions.
We can then ask what further global symmetries we must impose on this
theory in order to give it the global symmetries that we see in Nature.

*P. Sikivie, L. Susskind, M. Voloshin, and V. Zakharov, Nucl. Phys. B173, 189
(1980).

As a first step, we will ignore the Higgs scalar field and the mass terms 
of quarks, leptons, and gauge bosons. Then the Lagrangian of the theory of 
quarks and leptons is entirely specified by gauge invariance and renormalizability.
We have

$$ \mathcal{L}_K = -\frac{1}{4} \sum_i (F^{ia}_{\mu\nu})^2 + \sum_J \bar\psi_J (i \not{D}) \psi_J, $$    (20.126)

where the index i runs over the three factors of the gauge group and the index 
J runs over the various multiplets of chiral fermions. 

In principle, we could add to (20.126) the following pseudoscalar pure 
gauge operators: 

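$$ \mathcal{L}_\theta = \sum_i \theta_i \frac{g_i^2}{64\pi^2} \epsilon^{\mu\nu\lambda\sigma} F^{ia}_{\mu\nu} F^{ia}_{\lambda\sigma}. $$    (20.127)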

These terms are apparently odd under both P and T. However, we saw at the 
end of Section 19.2 that terms of this form can be generated or canceled by 
making a change of variables in the effective action. For example, the change 
of variables on the right-handed electron field 

$$ e_R \to e^{i\alpha} e_R $$    (20.128)

produces, according to (19.78) or (19.79), a correction to the Lagrangian involving
the P- and T-odd combination of field strengths for the $U(1)$ gauge
field

$$ \Delta\mathcal{L} = \alpha \frac{g'^2}{32\pi^2} \epsilon^{\mu\nu\lambda\sigma} F_{\mu\nu} F_{\lambda\sigma}. $$    (20.129)

The coefficient of (20.129) differs from the corresponding coefficient in (19.79) 
because we transform only the right-handed chiral component of the electron 
field. If we were to transform another fermion field, of hypercharge Y, we 
would find a similar shift, with the coefficient proportional to $Y^2$. If this new
field coupled to the $SU(2)$ or $SU(3)$ gauge fields, we would also find terms
proportional to those field strengths. Thus, we can eliminate the term in
(20.127) involving the $U(1)$ field strengths by making the change of variables
(20.128) with $\alpha = -\frac{1}{2}\theta_1$. We can eliminate all three terms in (20.127) by
making appropriate chiral rotations on three fermion multiplets, say, $e_R$, $E_L$,
and $Q_L$. The change of variables (20.128), which rotates the right-handed
electron field, is not symmetric under parity and, in fact, changes the definition 
of the parity operation. By making this change of variables, we are choosing 
new coordinates in which the P and T transformation properties of the whole 
theory are as simple as possible. 

Let us now investigate the properties of the Lagrangian (20.126) under 
P, C, and T. The couplings of the QCD gauge bosons are invariant to each 
of these symmetries separately. However, the couplings of the $SU(2)$ gauge
bosons violate P and C as much as possible. Recall from Section 3.6 that P
converts a left-handed electron to a right-handed electron, and that C converts
a left-handed electron to a left-handed positron. Each of these operations
converts a particle that couples to $SU(2)$ gauge bosons to one that does not.
However, the combination of these two operations interchanges left-handed 
particles with right-handed antiparticles. Thus the combined operation CP is 
a symmetry of (20.126). This Lagrangian is also invariant under time reversal. 

Thus, we see that the discrete symmetries of C and P, on the one hand, 
and CP and T, on the other, stand on a very different footing in gauge field 
theories. Any chiral gauge theory will naturally violate C and P. At this 
level in our analysis, it is a mystery why C and P should be observed to be 
approximate symmetries of Nature. On the other hand, every theory of gauge 
bosons and massless fermions respects CP and T. It is known experimentally 
that Nature contains some interaction that violates CP, since the CP selection 
rules are weakly violated in the decays of the $K^0$ meson. But to find a source
for this violation, we must add terms to our basic gauge theory (20.126). 

We must, first of all, add dynamics to (20.126) that will cause the spontaneous
breaking of $SU(2) \times U(1)$. We will begin by working with the simplest
model with one Higgs scalar field $\phi$. The most general renormalizable Lagrangian
for $\phi$ is

$$ \mathcal{L}_\phi = |D_\mu \phi|^2 + \mu^2 \phi^\dagger \phi - \lambda (\phi^\dagger \phi)^2. $$    (20.130)

The Hermiticity of $\mathcal{L}_\phi$ implies that the parameters $\mu^2$ and $\lambda$ are real. Thus
this Lagrangian respects P, C, and T. As discussed at the end of the previous
section, this Lagrangian also automatically has the custodial $SU(2)$ symmetry
required to produce the mass relation (20.117). 

Finally, we must add the terms that couple the Higgs field to the quarks 
and leptons. Here, renormalizability and gauge invariance provide the weakest 
constraints, and there are many allowed interactions. We will first analyze the 
coupling of $\phi$ to the quark fields, and then generalize this discussion to the
leptons. 

In writing the Higgs field couplings to the quarks, we should recall that 
there are known to be three generations of quarks and leptons. Thus there are 
three doublets of left-handed quarks: 


$$ Q_L^i = \begin{pmatrix} u^i \\ d^i \end{pmatrix}_L = \left( \begin{pmatrix} u_L \\ d_L \end{pmatrix}, \begin{pmatrix} c_L \\ s_L \end{pmatrix}, \begin{pmatrix} t_L \\ b_L \end{pmatrix} \right). $$    (20.131)

There are six right-handed quarks, three with $Y = \frac{2}{3}$ and three with $Y = -\frac{1}{3}$:

$$ u_R^i = (u_R, c_R, t_R), \qquad d_R^i = (d_R, s_R, b_R). $$    (20.132)

When we couple gauge fields to these quarks, we replace the ordinary derivatives
with covariant derivatives. This automatically gives all of the quarks the
same coupling to QCD and all quarks of the same type the same coupling to
the weak interactions. It does not allow mixing between the various quark flavors.
However, the coupling of the Higgs field to the quarks does not follow
from a gauge principle and so need not have any of these restrictions. Unless 
we require quark flavor conservation by postulating a new discrete symmetry 
of the theory, the Higgs couplings will, in general, mix the various flavors. 

If we do not impose any additional symmetries on the theory, we must 
write the most general renormalizable gauge-invariant coupling with the structure
of Eq. (20.101):

$$ \mathcal{L}_m = -\lambda_d^{ij} \bar Q_L^i \cdot \phi \, d_R^j - \lambda_u^{ij} \epsilon^{ab} \bar Q_{La}^i \phi_b^\dagger u_R^j + \text{h.c.}, $$    (20.133)

where $\lambda_d^{ij}$ and $\lambda_u^{ij}$ are general, not necessarily symmetric or Hermitian,
complex-valued matrices. The operation of CP interchanges the operators
written in (20.133) with their Hermitian conjugates without changing the coefficients;
thus, CP is equivalent to the substitutions

$$ \lambda_d^{ij} \to (\lambda_d^{ij})^*, \qquad \lambda_u^{ij} \to (\lambda_u^{ij})^*. $$    (20.134)

CP would be a symmetry of (20.133) if the matrices $\lambda_d^{ij}$ and $\lambda_u^{ij}$ were real-valued;
however, there is no principle that requires this. Without the imposition of 
further symmetry requirements, it seems that (20.133) does maximum violence 
to all discrete and flavor conservation symmetries. 

However, just as we were able to eliminate the T-violating terms (20.127) 
by making a chiral rotation, we can simplify the form of (20.133) by appropriate
chiral transformations. To find the required transformations, diagonalize
the Hermitian matrices obtained by squaring $\lambda_u$ and $\lambda_d$. Define unitary matrices
$U_u$ and $W_u$ by

$$ \lambda_u \lambda_u^\dagger = U_u D_u^2 U_u^\dagger, \qquad \lambda_u^\dagger \lambda_u = W_u D_u^2 W_u^\dagger, $$    (20.135)

where $D_u^2$ is a diagonal matrix with positive eigenvalues. Then

$$ \lambda_u = U_u D_u W_u^\dagger, $$    (20.136)

where $D_u$ is the diagonal matrix whose diagonal elements are the positive
square roots of the eigenvalues of (20.135). We can define unitary matrices $U_d$
and $W_d$ in a similar way and decompose $\lambda_d$ as

$$ \lambda_d = U_d D_d W_d^\dagger. $$    (20.137)

Now make the change of variables

$$ u_R^i \to W_u^{ij} u_R^j, \qquad d_R^i \to W_d^{ij} d_R^j. $$    (20.138)

This eliminates the unitary matrices $W_u$ and $W_d$ from the Higgs coupling
(20.133). Since each of the three $u_R^i$ and each of the three $d_R^i$ have the same
coupling to the gauge fields, $W_u$ and $W_d$ commute with the corresponding
covariant derivatives. Thus, under (20.138),

$$ \sum_i \left( \bar u_R^i (i \not{D}) u_R^i + \bar d_R^i (i \not{D}) d_R^i \right) \to \sum_i \left( \bar u_R^i (i \not{D}) u_R^i + \bar d_R^i (i \not{D}) d_R^i \right), $$    (20.139)

and so $W_u$ and $W_d$ disappear from the theory.

The analogous transformation on the left-handed fields also makes a dramatic
simplification. Make the change of variables

$$ u_L^i \to U_u^{ij} u_L^j, \qquad d_L^i \to U_d^{ij} d_L^j. $$    (20.140)

This transformation eliminates $U_u$ and $U_d$ from the terms in (20.133) that involve
the lower component of the Higgs field. In unitarity gauge, only these terms
survive. By combining the diagonal elements of $D_u$ and $D_d$ with the vacuum
expectation value of the Higgs field, we can relate these elements to quark
masses:

$$ m_u^i = \frac{1}{\sqrt{2}} D_u^{ii} v, \qquad m_d^i = \frac{1}{\sqrt{2}} D_d^{ii} v. $$    (20.141)

With this replacement, (20.133) takes the form

$$ \mathcal{L}_m = -m_d^i \bar d_L^i d_R^i \left( 1 + \frac{h}{v} \right) - m_u^i \bar u_L^i u_R^i \left( 1 + \frac{h}{v} \right) + \text{h.c.} $$    (20.142)

This has the standard form of quark mass terms and Higgs boson couplings. 
The transformations (20.138) and (20.140) thus convert the quark fields to the 
basis of mass eigenstates. In this basis, the mass terms and Higgs couplings 
are diagonal in flavor and conserve P, C, and T. 

Since left-handed u and d quarks have identical couplings to QCD, the
matrices $U_u$ and $U_d$ commute with the QCD couplings in the covariant derivative.
However, $u_L$ and $d_L$ are mixed by the weak interactions, and so we must
investigate the effect of (20.140) on the $SU(2) \times U(1)$ couplings more carefully.
This is most easily done by referring to the Lagrangian (20.79). The matrices
$U_u$ and $U_d$ cancel out of the pure kinetic terms in the first line of (20.79).
They also cancel out of the electromagnetic current; for example,

$$ \bar u_L^i \gamma^\mu u_L^i \to \bar u_L^i U_u^{\dagger ij} \gamma^\mu U_u^{jk} u_L^k = \bar u_L^i \gamma^\mu u_L^i. $$    (20.143)

By the same logic, $U_u$ and $U_d$ cancel out of the $Z^0$ boson current.

However, in the current that couples to the W boson field, we find

$$ J^{\mu+} = \frac{1}{\sqrt{2}} \bar u_L^i \gamma^\mu d_L^i \to \frac{1}{\sqrt{2}} \bar u_L^i \gamma^\mu (U_u^\dagger U_d)_{ij} d_L^j. $$    (20.144)

That is, the charge-changing weak interactions link the three $u_L^i$ quarks with
a unitary rotation of the triplet of $d_L^i$ quarks, with this rotation given by the
unitary matrix

$$ V = U_u^\dagger U_d. $$    (20.145)


The matrix V is known as the Cabibbo-Kobayashi-Maskawa (CKM) mixing 
matrix. 

The matrix V can have complex elements, but we can remove phases 
from V by performing phase rotations of the various quark fields. Before 
analyzing the case of three generations, it is useful to consider the case of two 
generations: u, d, c, and s. In this case, V is a 2 x 2 unitary matrix. Such a
matrix has 4 parameters; we can write its most general form as

$$ V = \begin{pmatrix} \cos\theta_c \, e^{i\alpha} & \sin\theta_c \, e^{i\beta} \\ -\sin\theta_c \, e^{i(\alpha+\gamma)} & \cos\theta_c \, e^{i(\beta+\gamma)} \end{pmatrix}. $$    (20.146)

One parameter of V is a rotation angle, and the other three are phases. We
can remove these phases by performing the change of variables on the quark
fields

$$ q_L^i \to \exp[i\alpha^i] \, q_L^i. $$    (20.147)

This global phase rotation has no effect on any term of the Lagrangian except
for the weak charged current (20.144). A phase rotation that is equal for all
four quark flavors cancels out of (20.144). However, the other three possible
phase transformations are just what we need to eliminate $\alpha$, $\beta$, and $\gamma$.

When we have chosen the phases of the quark fields in this way, V takes 
the form 


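$$ V = \begin{pmatrix} \cos\theta_c & \sin\theta_c \\ -\sin\theta_c & \cos\theta_c \end{pmatrix}. $$    (20.148)

Then the quark terms in the weak charged current can be written

$$ J^{\mu+} = \frac{1}{\sqrt{2}} \left( \cos\theta_c \, \bar u_L \gamma^\mu d_L + \sin\theta_c \, \bar u_L \gamma^\mu s_L - \sin\theta_c \, \bar c_L \gamma^\mu d_L + \cos\theta_c \, \bar c_L \gamma^\mu s_L \right). $$    (20.149)

We have already seen, in Eqs. (18.31) and (18.32), that this is the way the
s quark enters the weak interactions. The angle $\theta_c$ is the Cabibbo angle, as
defined in Eq. (18.30).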
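
The removal of the three phases in the two-generation case can be made concrete with a short numerical sketch. The assumptions are these: the general matrix is parametrized by an angle and three phases alpha, beta, gamma, with entries cos(theta) e^{i alpha}, sin(theta) e^{i beta}, -sin(theta) e^{i(alpha+gamma)}, and cos(theta) e^{i(beta+gamma)}; the particular phase assignment for (u, c) and (d, s) used below is one convenient choice among many; and the function names and numerical values are illustrative only.

```python
import cmath
import math

def general_mixing_matrix(theta, alpha, beta, gamma):
    """General 2x2 unitary matrix: one rotation angle and three phases."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c * cmath.exp(1j * alpha), s * cmath.exp(1j * beta)],
        [-s * cmath.exp(1j * (alpha + gamma)), c * cmath.exp(1j * (beta + gamma))],
    ]

def rephase(V, up_phases, down_phases):
    """Apply u_L^i -> exp(i a_i) u_L^i and d_L^j -> exp(i b_j) d_L^j.

    The charged current contains ubar_L^i V_ij d_L^j, so V_ij is multiplied
    by exp(-i a_i) * exp(+i b_j).
    """
    return [
        [cmath.exp(-1j * up_phases[i]) * V[i][j] * cmath.exp(1j * down_phases[j])
         for j in range(2)]
        for i in range(2)
    ]

theta, alpha, beta, gamma = 0.23, 0.7, -1.1, 2.4  # arbitrary illustrative values
V = general_mixing_matrix(theta, alpha, beta, gamma)

# One choice of phases for (u, c) and (d, s) that removes alpha, beta, gamma.
V_real = rephase(V, up_phases=[alpha, alpha + gamma], down_phases=[0.0, alpha - beta])

expected = [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]
max_deviation = max(abs(V_real[i][j] - expected[i][j])
                    for i in range(2) for j in range(2))
print("largest deviation from a real rotation:", max_deviation)

# A phase common to all four quarks drops out of V entirely.
delta = 0.9
V_common = rephase(V, up_phases=[delta, delta], down_phases=[delta, delta])
common_deviation = max(abs(V_common[i][j] - V[i][j])
                       for i in range(2) for j in range(2))
print("change in V under a common phase:", common_deviation)
```

The second check illustrates why only three of the four quark phases are available: the common rotation leaves V untouched, so it cannot be used to absorb anything.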

The same set of arguments can be made for the theory with three generations.
Here V is a general unitary 3 x 3 matrix. Such a matrix has 9 parameters.
Of these, 3 are rotation angles; this is the number of parameters of an $O(3)$
rotation. The remaining 6 parameters are phases. We can remove these phases
by making phase rotations of quark fields as in (20.147), but the overall phase
is redundant, so we can remove only 5 of these phases. The final form of V
contains 3 angles, of which one is the Cabibbo angle, and one phase. After
all the transformations we have made, this one phase that makes some couplings
of the $W^+$ to quarks complex is the only remaining parameter that
violates CP. 
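
The counting in this paragraph extends to any number of generations, and a small sketch makes the bookkeeping explicit. The function name `ckm_parameter_count` and the dictionary keys are illustrative; the rule itself is the one used in the text: an n x n unitary matrix has n^2 real parameters, of which n(n-1)/2 are rotation angles, and rephasing the 2n quark fields removes 2n - 1 phases because the overall phase is redundant.

```python
def ckm_parameter_count(n):
    """Count angles and phases of the n-generation quark mixing matrix."""
    total = n * n                  # real parameters of an n x n unitary matrix
    angles = n * (n - 1) // 2      # parameters of an O(n) rotation
    phases = total - angles
    removable = 2 * n - 1          # quark rephasings, minus the common phase
    return {
        "total": total,
        "angles": angles,
        "phases": phases,
        "removable_phases": removable,
        "physical_phases": phases - removable,
    }

for n in (2, 3):
    print(n, ckm_parameter_count(n))
```

For two generations the count leaves one angle and no phase; for three generations it leaves three angles and a single phase, in agreement with the statements above.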

We began this argument from a Lagrangian for the quark-Higgs boson 
coupling that seemed to violate all possible flavor symmetries and all discrete
spacetime symmetries. However, by making changes of variables on the
fermion fields, we have been able to dramatically simplify the form of the Lagrangian.
If we keep only those terms involving the massless gauge bosons,
the photon and the gluons, plus the mass terms and interactions written
in (20.142), we see that this set of terms conserves P, C, T, and all flavor
symmetries. This dramatic simplification occurs because the unbroken gauge
symmetry of Nature, the gauge symmetry of QCD and QED, is nonchiral and
can be written as acting on Dirac fermions. Since we have omitted only the
effects mediated by the massive W and Z bosons, this much of the analysis
already guarantees that Nature will appear, to a high degree of approximation,
to respect the three separate discrete symmetries and all quark flavor
conservation laws. Notice that we did not assume any fundamental global symmetries,
but depended only on the assignment of gauge quantum numbers in
the $SU(3) \times SU(2) \times U(1)$ gauge theory.

Figure 20.7. Higher-order diagrams that seem to give the leading contributions
to flavor-changing weak neutral current processes: (a) $K^0 \to \mu^+\mu^-$;
(b) $K^0 \leftrightarrow \bar K^0$.

If we include the Z boson and the weak neutral current, we have a theory 
that violates P and C through Z exchange but that respects CP. In addition, 
this theory respects all flavor conservation laws. We describe this situation by 
saying that there is no flavor-changing weak neutral current. The experimental 
evidence for this statement is quite impressive. The best tests come from the 
study of the neutral $K^0$ meson, which is an $\bar s d$ bound state and so could decay
by $Z^0$ exchange if this boson coupled to a flavor-changing current. In fact, the
decay $K^0 \to \mu^+\mu^-$ is highly suppressed, to the level of the one-loop weak
interaction correction shown in Fig. 20.7(a). Similarly, the interconversion of
$K^0$ and $\bar K^0$, which could proceed directly if the $Z^0$ could change flavor, is
suppressed to the level of the contribution shown in Fig. 20.7(b).

On the other hand, W bosons couple to currents that can change quark 
flavor, in a pattern parametrized by the Cabibbo angle and the other angles 
in the CKM matrix. Thus, heavy quark flavors decay by W boson exchange 
processes. Since the W couples to a current that contains only left-handed 
quarks, it mediates an interaction that violates P and C maximally. This 
violation of discrete symmetries is concealed from our ordinary experience because
the amplitude for W exchange is small. However, this P and C violation
is a dramatic qualitative feature of weak decays. 

Since the coupling of the W to quarks contains an irreducible phase, these 
couplings in principle can violate CP. However, we have seen that this phase 
can be removed in a theory with only two generations. This means that the 
phase of the CKM matrix can have physical consequences only in a process 
that involves all three generations. Typically, this means that the CKM phase 
can contribute only to weak interaction loop corrections or to complicated 
exclusive decay processes. Thus the $SU(3) \times SU(2) \times U(1)$ theory can account
for CP violation, and also explains why this effect is much weaker even than
the weak interactions. It is interesting to note that Kobayashi and Maskawa
originally proposed the existence of the third generation in order to provide a
mechanism for CP violation.†

On the other hand, at this moment there is no conclusive evidence that 
the origin of CP violation is indeed the phase of the CKM matrix. All of the 
arguments we have given in this section have used the simplest model of the 
Higgs sector, in which this sector consists of a single scalar field. More general 
models of the Higgs sector may leave behind a more complicated set of
quark-Higgs couplings than appear in (20.142), and some of these may violate CP.
In addition, there may be terms in the Higgs sector itself that lead to CP 
violation. The origin of the observed CP violation is still an open problem 
that needs both theoretical and experimental exploration. 

Before leaving this subject, we must discuss one more aspect of this argument
that is still mysterious. To simplify the Lagrangian of the gauge theory
of quarks to its final form, we needed to make chiral changes of variables in
the functional integral. We saw in Section 19.2, and we reviewed at the beginning
of this section, that such changes of variables produce the new P- and
T-violating terms written in Eq. (20.127). It can be shown, using the fact that
these terms are total derivatives, that the terms involving $SU(2)$ and $U(1)$
field strengths have no observable effects. However, the term involving QCD
field strengths can induce an electric dipole moment for the neutron, a T-violating
effect that has been searched for and excluded at an impressive level
of accuracy. Thus the P- and T-violating combination of QCD field strengths
cannot be allowed to appear in the Lagrangian. On the other hand, if the
original up and down quark Higgs coupling matrices were of the most general
possible form, it seems that this cannot be avoided. This problem is known
as the strong CP problem. To solve this problem, one must either constrain
the Higgs coupling matrices, violating the spirit of the argument we have just
concluded, or one must add additional structure to the Higgs sector.‡

Finally, let us discuss the general form and simplification of the Higgs 
boson couplings to leptons. When we wrote the Glashow-Weinberg-Salam Lagrangian
in the previous section, we noted that no gauge field coupled to the
right-handed neutrino. Thus, we chose to eliminate this particle from the theory.
We might need right-handed components of the neutrinos to construct
neutrino mass terms, but at the moment there is no evidence for nonzero 
neutrino masses. Thus, in the remainder of this section, we will assume that 
there are no right-handed neutrinos and work out the consequences of this 
assumption.* 

†M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973).

‡The strong CP problem, its proposed solutions, and their unexpected implications
are reviewed by R. D. Peccei in CP Violation, C. Jarlskog, ed. (World Scientific,
1989).

*In generalizations of the $SU(2) \times U(1)$ model, neutrinos can acquire Majorana
mass terms that are naturally very small. These models also respect the constraints
on lepton flavor mixing described in the next paragraph. For an introduction to these
ideas on neutrino mass, see P. Ramond, in Perspectives in the Standard Model, R. K.
Ellis, C. T. Hill, and J. D. Lykken, eds. (World Scientific, 1992).

Generalizing Eq. (20.133), we can write the most general coupling of a
Higgs boson to three generations of leptons. Since there are no right-handed
neutrinos, the only possible coupling is

$$ \mathcal{L}_m = -\lambda_\ell^{ij} \bar E_L^i \cdot \phi \, e_R^j + \text{h.c.} $$    (20.150)

To diagonalize this coupling, represent $\lambda_\ell$ in the form

$$ \lambda_\ell = U_\ell D_\ell W_\ell^\dagger, $$    (20.151)

and eliminate the matrices $U_\ell$ and $W_\ell$ by the changes of variables

$$ e_L^i \to U_\ell^{ij} e_L^j, \qquad \nu_L^i \to U_\ell^{ij} \nu_L^j, \qquad e_R^i \to W_\ell^{ij} e_R^j. $$    (20.152)

Since we are now making the same change of variables on the two components
of the weak doublet $E_L$, this change of variables commutes with the $SU(2)$
interactions in the covariant derivative. Thus the unitary matrices $U_\ell$ and $W_\ell$
completely disappear from the theory. The result is a theory of leptons that
conserves CP exactly and also conserves the lepton number of each generation.
This last result is very accurately tested experimentally. For example, there
is no evidence for the generation-changing muon decay processes $\mu^- \to e^-\gamma$
or $\mu^- \to e^-e^-e^+$; the branching ratios for these processes are known to be
below $10^{-10}$.

We have seen, then, that the $SU(3) \times SU(2) \times U(1)$ gauge theory of
quarks and leptons does an excellent job of accounting for the symmetries
and conservation laws that are observed in elementary particle phenomena.
It predicts which symmetries should be exact in Nature and which should be
approximate. For approximate symmetries, it gives an accurate estimate of
the level of symmetry violation. Most remarkably (except for the one issue
of the strong CP problem), none of these predictions depend on any underlying
global discrete or flavor symmetries in the fundamental equations. The
global symmetries that we observe in Nature follow only from gauge invariance
and the specific representation assignments that we made in constructing
our gauge theory description.
Problems


20.1 Spontaneous breaking of SU(5). Consider a gauge theory with the gauge
group SU(5), coupled to a scalar field $\Phi$ in the adjoint representation. Assume that
the potential for this scalar field forces it to acquire a nonzero vacuum expectation
value. Two possible choices for this expectation value are

$$\langle\Phi\rangle = A \begin{pmatrix} 1 &&&& \\ & 1 &&& \\ && 1 && \\ &&& 1 & \\ &&&& -4 \end{pmatrix} \qquad \text{and} \qquad \langle\Phi\rangle = B \begin{pmatrix} 2 &&&& \\ & 2 &&& \\ && 2 && \\ &&& -3 & \\ &&&& -3 \end{pmatrix}.$$

For each case, work out the spectrum of gauge bosons and the unbroken symmetry
group.


20.2 Decay modes of the W and Z bosons.

(a) Compute the partial decay widths of the W boson into pairs of quarks and
leptons. Assume that the top quark mass $m_t$ is larger than $m_W$, and ignore
the other quark masses. The decay widths to quarks are enhanced by QCD
corrections. Show that the correction is given, to order $\alpha_s$, by Eq. (17.9). Using
$\sin^2\theta_w = 0.23$, find a numerical value for the total width of the $W^+$.

(b) Compute the partial decay widths of the Z boson into pairs of quarks and
leptons, treating the quarks in the same way as in part (a). Determine the total
width of the Z boson and the fractions of the decays that give hadrons, charged
leptons, and invisible modes $\nu\bar\nu$.

20.3 $e^+e^- \to$ hadrons with photon-$Z^0$ interference.

(a) Consider a fermion species $f$ with electric charge $Q_f$ and weak isospin $I_L^3$ for its
left-handed component. Ignore the mass of the $f$. Compute the differential cross
section for the process $e^+e^- \to f\bar{f}$ in the standard electroweak model. Include
the effect of the $Z^0$ width using the Breit-Wigner formula, Eq. (7.60). Plot the
behavior of the total cross section as a function of CM energy through the $Z^0$
resonance, for $u$, $d$, and $\mu$.

(b) Compute the forward-backward asymmetry for $e^+e^- \to f\bar{f}$, defined as

$$A_{FB}^f = \frac{\left(\int_0^1 - \int_{-1}^0\right) d\cos\theta\,(d\sigma/d\cos\theta)}{\left(\int_0^1 + \int_{-1}^0\right) d\cos\theta\,(d\sigma/d\cos\theta)},$$

as a function of center of mass energy.

(c) Show that, just on the $Z^0$ resonance, the forward-backward asymmetry is given
by

$$A_{FB}^f = \frac{3}{4} A_{LR}^e A_{LR}^f.$$

(d) Show that the cross section at the peak of the $Z^0$ resonance is given by

$$\sigma_{\text{peak}} = \frac{12\pi}{m_Z^2}\, \frac{\Gamma(Z^0 \to e^+e^-)\,\Gamma(Z^0 \to f\bar{f})}{\Gamma_Z^2},$$

where $\Gamma_Z$ is the total width of the $Z^0$. Notice that both the total width of the
$Z^0$ and the peak height are affected by the presence of extra invisible decay
modes. Compute the shifts in $\Gamma_Z$ and $\sigma_{\text{peak}}$ that would be produced by a
hypothetical fourth neutrino species, and compare these shifts to the cross section
measurements shown in Fig. 20.5.

20.4 Neutral-current deep inelastic scattering. 

(a) In Eq. (17.35), we wrote formulae for neutrino and antineutrino deep inelastic
scattering with $W^\pm$ exchange. Neutrinos and antineutrinos can also scatter by
exchanging a $Z^0$. This process, which leads to a hadronic jet but no observable
outgoing lepton, is called the neutral current reaction. Compute $d\sigma/dx\,dy$ for
neutral current deep inelastic scattering of neutrinos and antineutrinos from
protons, accounting for scattering from $u$ and $d$ quarks and antiquarks.

(b) Next, consider deep inelastic scattering from a nucleus $A$ with equal numbers
of protons and neutrons. For such a target, $f_u(x) = f_d(x)$, and similarly for
antiquarks. Show that the formulae in part (a) simplify in such a situation. In
particular, let $R^\nu$, $R^{\bar\nu}$ be defined as

$$R^\nu = \frac{d\sigma/dx\,dy\,(\nu A \to \nu X)}{d\sigma/dx\,dy\,(\nu A \to \mu^- X)}, \qquad R^{\bar\nu} = \frac{d\sigma/dx\,dy\,(\bar\nu A \to \bar\nu X)}{d\sigma/dx\,dy\,(\bar\nu A \to \mu^+ X)}.$$

Show that $R^\nu$ and $R^{\bar\nu}$ are given by the following simple formulae:

$$R^\nu = \tfrac{1}{2} - \sin^2\theta_w + \tfrac{5}{9}\sin^4\theta_w\,(1 + r),$$

$$R^{\bar\nu} = \tfrac{1}{2} - \sin^2\theta_w + \tfrac{5}{9}\sin^4\theta_w\,\Big(1 + \frac{1}{r}\Big),$$

where

$$r = \frac{d\sigma/dx\,dy\,(\bar\nu A \to \mu^+ X)}{d\sigma/dx\,dy\,(\nu A \to \mu^- X)}.$$

These formulae remain true when $R^\nu$ and $R^{\bar\nu}$ are redefined to be the ratios of
neutral- to charged-current cross sections integrated over the region of $x$ and $y$
that is observed in a given experiment.

(c) By setting $r$ equal to the observed value (say, $r = 0.4$) and varying $\sin^2\theta_w$,
the relations of part (b) generate a curve in the plane of $R^\nu$ versus $R^{\bar\nu}$ that is
known as Weinberg's nose. Sketch this curve. The observed values of $R^\nu$, $R^{\bar\nu}$ lie
close to this curve, near the point corresponding to $\sin^2\theta_w = 0.23$.

20.5 A model with two Higgs fields. 

(a) Consider a model with two scalar fields $\phi_1$ and $\phi_2$, which transform as SU(2)
doublets with $Y = 1/2$. Assume that the two fields acquire parallel vacuum
expectation values of the form (20.23) with vacuum expectation values $v_1$, $v_2$.
Show that these vacuum expectation values produce the same gauge boson mass
matrix that we found in Section 20.2, with the replacement

$$v^2 \to (v_1^2 + v_2^2).$$

(b) The most general potential function for a model with two Higgs doublets is quite
complex. However, if we impose the discrete symmetry $\phi_1 \to -\phi_1$, $\phi_2 \to \phi_2$,
the most general potential is

$$V(\phi_1, \phi_2) = -\mu_1^2\,\phi_1^\dagger\phi_1 - \mu_2^2\,\phi_2^\dagger\phi_2 + \lambda_1(\phi_1^\dagger\phi_1)^2 + \lambda_2(\phi_2^\dagger\phi_2)^2$$
$$\qquad\qquad + \lambda_3(\phi_1^\dagger\phi_1)(\phi_2^\dagger\phi_2) + \lambda_4(\phi_1^\dagger\phi_2)(\phi_2^\dagger\phi_1) + \lambda_5\big((\phi_1^\dagger\phi_2)^2 + \text{h.c.}\big).$$

Find conditions on the parameters $\mu_i$ and $\lambda_i$ so that the configuration of vacuum
expectation values required in part (a) is a locally stable minimum of this
potential.

(c) In the unitarity gauge, one linear combination of the upper components of $\phi_1$
and $\phi_2$ is eliminated, while the other remains as a physical field. Show that the
physical charged Higgs field has the form

$$\phi^+ = \sin\beta\,\phi_1^+ - \cos\beta\,\phi_2^+,$$

where $\beta$ is defined by the relation

$$\tan\beta = \frac{v_2}{v_1}.$$

(d) Assume that the two Higgs fields couple to quarks by the set of fundamental
couplings

$$\mathcal{L}_m = -\lambda_d^{ij}\,\bar{Q}_L^i \cdot \phi_1\, d_R^j - \lambda_u^{ij}\,\epsilon^{ab}\,\bar{Q}_{La}^i\,\phi_{2b}^\dagger\, u_R^j + \text{h.c.}$$

Find the couplings of the physical charged Higgs boson of part (c) to the mass
eigenstates of quarks. These couplings depend only on the values of the quark
masses and $\tan\beta$ and on the elements of the CKM matrix.



Chapter 21 


Quantization of Spontaneously 
Broken Gauge Theories 


In Chapter 20 we saw that when a gauge symmetry is spontaneously broken, 
the gauge bosons acquire mass. This phenomenon allowed us to construct a 
realistic theory of the weak interactions. Up to this point, however, we have 
discussed spontaneously broken gauge theories only in a simplistic way. To 
isolate the physical degrees of freedom, we have used the device of going to the 
unitarity gauge. However, it is not at all clear what the rules of perturbation 
theory are in this gauge, or how the unitarity gauge constraint is maintained 
when we compute Feynman diagrams. We have also seen that the Goldstone 
bosons that are absorbed into the massive gauge bosons play an important 
role in formal arguments about these theories, so we would like to quantize 
these theories in a gauge that does not eliminate these particles from the 
beginning. 

In this chapter we will address these problems, by carrying out the formal
gauge-fixing of theories with spontaneously broken gauge symmetry using
the Faddeev-Popov method. We will define a class of gauges, called the
$R_\xi$ gauges, almost all of which contain the Goldstone bosons of the original
spontaneous symmetry breaking. These particles cancel the effects of other
unphysical particles in the formalism to maintain the unitarity of the theory.
These cancellations are a more intricate version of the cancellations between 
gauge and ghost degrees of freedom that we saw in Chapter 16. However, we 
will see in Section 21.2 that a theory does not forget that it contains Goldstone 
bosons and that, under some circumstances, the properties of the Goldstone 
bosons in the theory without gauge couplings can carry over to the theory 
with massive gauge bosons. 

Finally, having defined the perturbation theory and clarified the role of the 
Goldstone bosons in spontaneously broken gauge theories, we will carry out 
some explicit loop calculations of interest in the theory of weak interactions. 
Here we will see applications of the ideas of Chapter 11, that a theory with 
spontaneously broken symmetry can be renormalized with the counterterms 
of the symmetric Lagrangian. In Section 21.3 we will show through some 
examples that this result applies with equal force to gauge theories, and that 
it endows the weak-interaction gauge theory with substantial predictive power. 


21.1 The $R_\xi$ Gauges

In our discussion of the low-energy effective Lagrangian for weak interactions, 
we proposed in Eq. (20.89) the following expression for the propagator of a 
massive gauge boson: 

$$\langle A^\mu(p) A^\nu(-p) \rangle = \frac{-i g^{\mu\nu}}{p^2 - m^2}. \tag{21.1}$$

This expression is a natural first guess, generalizing the Feynman-‘t Hooft 
gauge. However, it is unsatisfactory in a number of ways. 

The most important of these defects concerns the treatment of gauge 
boson polarization states. The propagator (21.1) contains four components, 
corresponding to the transverse, longitudinal, and timelike polarizations. We 
saw in Chapters 5 and 16 that, for massless gauge bosons, the unphysical 
longitudinal and timelike components cancel in computations. For a massive 
gauge boson, however, the longitudinal polarization state corresponds to a 
real physical particle; we do not want it to cancel. Expression (21.1) does not 
take this change into account. 

An Abelian Example 

To understand this and other formal problems that arise for gauge theories
with spontaneously broken symmetry, we need to carefully redo the Faddeev-Popov
quantization of these theories. To begin, we will quantize the spontaneously
broken Abelian gauge theory introduced in Eq. (20.1):

$$\mathcal{L} = -\tfrac{1}{4}(F_{\mu\nu})^2 + |D_\mu\phi|^2 - V(\phi), \tag{21.2}$$

with $D_\mu = \partial_\mu + ieA_\mu$. Here $\phi(x)$ is a complex scalar field. However, it will
be most convenient to analyze the model by writing $\phi$ in terms of its real
components,

$$\phi = \frac{1}{\sqrt{2}}(\phi^1 + i\phi^2). \tag{21.3}$$

Then the infinitesimal local symmetry transformation is

$$\delta\phi^1 = -\alpha(x)\phi^2, \qquad \delta\phi^2 = \alpha(x)\phi^1, \qquad \delta A_\mu = -\frac{1}{e}\partial_\mu\alpha. \tag{21.4}$$

Let us assume that $V(\phi)$ forces the scalar field to acquire a vacuum
expectation value: $\langle\phi^1\rangle = v$. Then we should change variables by a shift:

$$\phi^1(x) = v + h(x); \qquad \phi^2(x) = \varphi(x). \tag{21.5}$$

The field $\phi^2$ or $\varphi$ is the Goldstone boson. The Lagrangian (21.2) now takes
the form

$$\mathcal{L} = -\tfrac{1}{4}(F_{\mu\nu})^2 + \tfrac{1}{2}(\partial_\mu h - eA_\mu\varphi)^2 + \tfrac{1}{2}\big(\partial_\mu\varphi + eA_\mu(v + h)\big)^2 - V(\phi). \tag{21.6}$$
This Lagrangian is still invariant under an exact local symmetry,

$$\delta h = -\alpha(x)\varphi, \qquad \delta\varphi = \alpha(x)(v + h), \qquad \delta A_\mu = -\frac{1}{e}\partial_\mu\alpha. \tag{21.7}$$

Thus, in order to define the functional integral over the variables $(h, \varphi, A_\mu)$,
we must introduce Faddeev-Popov gauge fixing.

Starting from the functional integral

$$Z = \int \mathcal{D}A\,\mathcal{D}h\,\mathcal{D}\varphi\; e^{i\int \mathcal{L}[A,h,\varphi]}, \tag{21.8}$$

we can introduce a gauge-fixing constraint as we did in Section 9.4. Following
the steps leading from Eq. (9.50) to Eq. (9.54), we find

$$Z = C \cdot \int \mathcal{D}A\,\mathcal{D}h\,\mathcal{D}\varphi\; e^{i\int \mathcal{L}[A,h,\varphi]}\; \delta\big(G(A,h,\varphi)\big)\, \det\!\Big(\frac{\delta G}{\delta\alpha}\Big), \tag{21.9}$$

where $C$ is a constant proportional to the volume of the gauge group and
$G(A,h,\varphi)$ is a gauge-fixing condition. Alternatively, we can introduce the
gauge-fixing constraint as $\delta(G(x) - \omega(x))$ and integrate over $\omega(x)$ with a
Gaussian weight, as in the derivation of Eq. (9.56). This gives

$$Z = C' \cdot \int \mathcal{D}A\,\mathcal{D}h\,\mathcal{D}\varphi\; \exp\Big[i\int d^4x\,\big(\mathcal{L}[A,h,\varphi] - \tfrac{1}{2}(G)^2\big)\Big]\, \det\!\Big(\frac{\delta G}{\delta\alpha}\Big). \tag{21.10}$$


The gauge-fixing function G is arbitrary, but we can simplify our formalism 
by choosing it appropriately. 

An especially convenient choice of the gauge-fixing function is

$$G = \frac{1}{\sqrt{\xi}}\big(\partial_\mu A^\mu - \xi e v \varphi\big). \tag{21.11}$$

When we form $G^2$, the term quadratic in $A_\mu$ will provide the same
gauge-dependent addition to the gauge field action that we saw in the derivation
of Eqs. (9.58) and (16.29). In addition, the cross term between $A_\mu$ and $\varphi$ is
engineered to cancel the quadratic term of the form $\partial_\mu\varphi A^\mu$ coming from the
third term of (21.6). With this choice, the quadratic terms of the gauge-fixed
Lagrangian $(\mathcal{L} - \tfrac{1}{2}G^2)$ are

$$\mathcal{L}_2 = -\tfrac{1}{2} A_\mu\Big(-g^{\mu\nu}\partial^2 + \big(1 - \tfrac{1}{\xi}\big)\partial^\mu\partial^\nu - (ev)^2 g^{\mu\nu}\Big)A_\nu$$
$$\qquad + \tfrac{1}{2}(\partial_\mu h)^2 - \tfrac{1}{2} m_h^2 h^2 + \tfrac{1}{2}(\partial_\mu\varphi)^2 - \tfrac{\xi}{2}(ev)^2\varphi^2. \tag{21.12}$$

The mass term for the $h$ field comes from the expansion of $V(\phi)$, as in (20.6).
The mass term for the gauge field comes from the Higgs mechanism, that is,
from the third term of (21.6). Notice that the formalism also produces a mass
for the Goldstone boson $\varphi$:

$$m_\varphi^2 = \xi (ev)^2 = \xi m_A^2. \tag{21.13}$$
The fact that this mass is gauge-dependent is a signal that the Goldstone
boson is a fictitious field, which will not be produced in physical processes.

To complete the Faddeev-Popov quantization procedure, we must derive
the Lagrangian of the ghosts. This Lagrangian depends on the gauge variation
of $G$, which can be computed by inserting (21.7) into (21.11). We find

$$\frac{\delta G}{\delta\alpha} = \frac{1}{\sqrt{\xi}}\Big(-\frac{1}{e}\partial^2 - \xi e v (v + h)\Big). \tag{21.14}$$

The determinant of this operator can be accounted for by including a set of
Faddeev-Popov ghosts with the Lagrangian

$$\mathcal{L}_{\text{ghost}} = \bar{c}\Big[-\partial^2 - \xi m_A^2\Big(1 + \frac{h}{v}\Big)\Big]c, \tag{21.15}$$

where $m_A = ev$ as in Eq. (21.13). Since this is an Abelian gauge theory, the
ghost field does not couple directly to the gauge field. It does, however, couple
to the physical Higgs field, so it cannot be completely ignored as in QED.

From the quadratic terms in the Lagrangians for $A_\mu$, $h$, $\varphi$, and the ghosts,
we can readily find the propagators for these fields. All four propagators are
shown in Fig. 21.1. The only complicated case is that of the gauge field. The
term in (21.12) involving $A_\mu$ involves an operator whose Fourier transform is

$$g^{\mu\nu}k^2 - \big(1 - \tfrac{1}{\xi}\big)k^\mu k^\nu - m_A^2 g^{\mu\nu} = \Big(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big)(k^2 - m_A^2) + \Big(\frac{k^\mu k^\nu}{k^2}\Big)\frac{1}{\xi}(k^2 - \xi m_A^2). \tag{21.16}$$

The inverse of this matrix gives the $A_\mu$ field propagator:

$$\langle A^\mu(k) A^\nu(-k) \rangle = \frac{-i}{k^2 - m_A^2}\Big(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big) + \frac{-i\xi}{k^2 - \xi m_A^2}\Big(\frac{k^\mu k^\nu}{k^2}\Big)$$
$$\qquad = \frac{-i}{k^2 - m_A^2}\Big(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2 - \xi m_A^2}(1 - \xi)\Big). \tag{21.17}$$

Notice that the transverse components of the $A$ field and the component $h$
of the Higgs field acquire the masses $m_A$, $m_h$ that we found in Section 20.1.
The unphysical components of $A$, the Goldstone bosons, and the ghosts all
acquire the same gauge-dependent mass $\sqrt{\xi}\, m_A$.

$\xi$ Dependence in Perturbation Theory

Because the parameter $\xi$ was introduced only in the gauge fixing, we expect
it to cancel out of all computations of expectation values of gauge-invariant
operators and of S-matrix elements. This cancellation can be proved to all
orders in perturbation theory by using the BRST symmetry of the gauge-fixed
Lagrangian.* Here, however, we will simply illustrate the cancellation of
$\xi$ in a simple example.


*See, for example, Taylor (1976). 
Figure 21.1. Propagators of the gauge field, Higgs fields, and ghosts in the
Abelian model with spontaneously broken symmetry.


Figure 21.2. Diagrams contributing to fermion-fermion scattering at leading
order in the Abelian model with spontaneous symmetry breaking.

Consider coupling a fermion to the spontaneously broken gauge theory
through a chiral interaction:

$$\mathcal{L}_f = \bar\psi_L (i\not{\!\!D})\psi_L + \bar\psi_R (i\not{\!\partial})\psi_R - \lambda_f\big(\bar\psi_L \phi\, \psi_R + \bar\psi_R \phi^* \psi_L\big), \tag{21.18}$$

with $D_\mu = \partial_\mu + ieA_\mu$ as before. This is a stripped-down, Abelian version of
the coupling of fermions to the weak interaction gauge theory. The fermion $\psi$
receives a mass

$$m_f = \lambda_f \frac{v}{\sqrt{2}} \tag{21.19}$$

from the spontaneous symmetry breaking. (This theory has an axial vector
anomaly that would render loop calculations inconsistent, but we will analyze
it only at the level of tree diagrams.)

In this theory, the leading-order diagrams contributing to fermion-fermion
scattering are those shown in Fig. 21.2. Notice that the contribution from the
exchange of the unphysical particle $\varphi$ must be included, since this particle
appears in the Feynman rules. The ghosts do not appear in this process until
the one-loop level. Since the propagator of the physical Higgs particle $h$ is
independent of $\xi$, the cancellation of the $\xi$ dependence must take place between
the transverse and longitudinal components of $A_\mu$ and the Goldstone boson $\varphi$.
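The graph with exchange of the Goldstone boson has the value

$$i\mathcal{M}_\varphi = \Big(\frac{\lambda_f}{\sqrt{2}}\Big)^2\, \bar{u}(p')\gamma^5 u(p)\, \frac{i}{q^2 - \xi m_A^2}\, \bar{u}(k')\gamma^5 u(k). \tag{21.20}$$

The $\xi$ dependence of this expression must be canceled by that of the gauge
boson exchange diagram,

$$i\mathcal{M}_A = (-ie)^2\, \bar{u}(p')\gamma_\mu\Big(\frac{1-\gamma^5}{2}\Big)u(p)\; \frac{-i}{q^2 - m_A^2}\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2 - \xi m_A^2}(1 - \xi)\Big)\; \bar{u}(k')\gamma_\nu\Big(\frac{1-\gamma^5}{2}\Big)u(k). \tag{21.21}$$

The $\xi$ dependence of this term looks quite intricate. However, we can make
some simplifications by rewriting the gauge boson propagator as

$$\frac{-i}{q^2 - m_A^2}\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{q^2 - \xi m_A^2}(1 - \xi)\Big) = \frac{-i}{q^2 - m_A^2}\Big[g^{\mu\nu} - \frac{q^\mu q^\nu}{m_A^2} + \frac{q^\mu q^\nu}{m_A^2}\Big(\frac{q^2 - m_A^2}{q^2 - \xi m_A^2}\Big)\Big]$$
$$\qquad = \frac{-i}{q^2 - m_A^2}\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{m_A^2}\Big) + \frac{-i}{q^2 - \xi m_A^2}\Big(\frac{q^\mu q^\nu}{m_A^2}\Big). \tag{21.22}$$

The first term of (21.22) is $\xi$-independent. The second term can be simplified
in (21.21) by using the identity

$$q_\mu\, \bar{u}(p')\gamma^\mu\Big(\frac{1-\gamma^5}{2}\Big)u(p) = \tfrac{1}{2}\,\bar{u}(p')\big[(\not{\!p} - \not{\!p}') - (\not{\!p} - \not{\!p}')\gamma^5\big]u(p)$$
$$\qquad = \tfrac{1}{2}\,\bar{u}(p')\big[\not{\!p}'\gamma^5 + \gamma^5\!\not{\!p}\big]u(p)$$
$$\qquad = m_f\, \bar{u}(p')\gamma^5 u(p), \tag{21.23}$$

and the analogous identity on the other fermion line. After making these
rearrangements and inserting the explicit values $m_f = \lambda_f v/\sqrt{2}$ and $m_A = ev$,
the gauge boson exchange amplitude (21.21) takes the form

$$i\mathcal{M}_A = (-ie)^2\, \bar{u}(p')\gamma_\mu\Big(\frac{1-\gamma^5}{2}\Big)u(p)\; \frac{-i}{q^2 - m_A^2}\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{m_A^2}\Big)\; \bar{u}(k')\gamma_\nu\Big(\frac{1-\gamma^5}{2}\Big)u(k)$$
$$\qquad + \Big(\frac{\lambda_f}{\sqrt{2}}\Big)^2\, \bar{u}(p')\gamma^5 u(p)\, \frac{-i}{q^2 - \xi m_A^2}\, \bar{u}(k')\gamma^5 u(k). \tag{21.24}$$

The second term of (21.24) precisely cancels the Goldstone boson exchange
diagram (21.20). The terms that remain in the fermion-fermion scattering
amplitude are independent of $\xi$.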
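The algebra behind this cancellation can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names and sample values are assumptions, and the relative sign between the two fermion lines ($+m_f$ on one line, $-m_f$ on the other) is taken from the identity (21.23) and its analogue with $q = p - p' = k' - k$. It verifies (i) that the coefficient of $q^\mu q^\nu$ in the full propagator splits as in (21.22), and (ii) that the $\xi$-dependent remainder cancels the Goldstone exchange once $m_f^2 = \lambda_f^2 v^2/2$ and $m_A^2 = e^2 v^2$ are inserted.

```python
# Sketch: exact check of the splitting (21.22) and of the cancellation
# between (21.20) and the second term of (21.24). Names are assumptions.
from fractions import Fraction


def qq_coefficients(q2, mA2, xi):
    """Coefficients of q^mu q^nu, with the overall factor -i removed, in
    the full R_xi propagator and in the two pieces of (21.22)."""
    full = -(1 - xi) / ((q2 - mA2) * (q2 - xi * mA2))
    physical = -1 / (mA2 * (q2 - mA2))       # xi-independent piece
    leftover = 1 / (mA2 * (q2 - xi * mA2))   # xi-dependent piece
    return full, physical, leftover


def gamma5_exchange_sum(q2, xi, lam2, e2, v2):
    """Sum of the coefficients of i [ubar g5 u][ubar g5 u] coming from
    Goldstone exchange and from the leftover gauge boson piece.
    lam2, e2, v2 stand for lambda_f^2, e^2, v^2."""
    mf2 = lam2 * v2 / 2   # m_f^2 = (lambda_f v / sqrt(2))^2
    mA2 = e2 * v2         # m_A^2 = (e v)^2
    goldstone = (lam2 / 2) / (q2 - xi * mA2)
    # (-ie)^2 * (-i)/(q^2 - xi mA^2) * (m_f)(-m_f)/mA^2, divided by i
    gauge_leftover = -e2 * mf2 / (mA2 * (q2 - xi * mA2))
    return goldstone + gauge_leftover


if __name__ == "__main__":
    full, physical, leftover = qq_coefficients(
        Fraction(7, 3), Fraction(5, 4), Fraction(2, 7))
    print(full == physical + leftover)
    print(gamma5_exchange_sum(Fraction(7, 3), Fraction(2, 7),
                              Fraction(1, 9), Fraction(3, 10), Fraction(4)))
```

Because `Fraction` arithmetic is exact, the two checks hold identically rather than to rounding accuracy, for any inputs that avoid the poles.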

This demonstration merits two additional comments. First, throughout
this book, we have become accustomed to dotting the gauge boson momentum
into a gauge boson vertex and finding zero or contact terms. However, in
spontaneously broken gauge theories, we typically find a different result. The
fermionic current $\bar\psi\gamma^\mu(1 - \gamma^5)\psi$ is not conserved, with the nonconservation
being proportional to the fermion mass. This allows the manipulation (21.23) to
contribute terms proportional to the Higgs boson vacuum expectation value,
which interplay with the Goldstone boson contributions. We will discuss this
point further, and find a physical application of it, in Section 21.2.

The second point concerns the final form of the gauge-invariant sum of
the gauge boson and Goldstone boson exchange diagrams. These give just the
result we would have found by neglecting the Goldstone boson and computing
the gauge boson exchange using the first term of (21.22) as the propagator:

$$\langle A^\mu(q) A^\nu(-q) \rangle = \frac{-i}{q^2 - m_A^2}\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{m_A^2}\Big). \tag{21.25}$$

The tensor structure represents a gauge boson polarization sum. To identify
what vectors are summed over, notice that, if the vector boson is on-shell, and
if we boost to its rest frame, this structure becomes precisely the projection
onto the three purely spatial directions. These are the three polarization states
of an on-shell massive vector particle. In a general frame, still for $q^\mu$ on-shell,
the tensor in (21.25) remains the projection onto physical polarization states:

$$\sum_{\epsilon_\mu q^\mu = 0} \epsilon^\mu \epsilon^{\nu *} = -\Big(g^{\mu\nu} - \frac{q^\mu q^\nu}{m_A^2}\Big). \tag{21.26}$$

Thus, in the cancellation of the $\xi$-dependent parts of the gauge boson
propagator, we also find that the Goldstone boson diagram cancels the contribution
of the unphysical timelike polarization state of the gauge boson, leaving over
the required three physical polarizations.
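The polarization sum (21.26) is easy to confirm in a frame where the boson moves along the $z$ axis. The sketch below assumes the standard choice of two transverse vectors and one longitudinal vector $\epsilon_L = (p, 0, 0, E)/m$; the function names and the sample kinematics are illustrative assumptions.

```python
# Sketch: check the massive polarization sum (21.26) for q = (E, 0, 0, p).
# Function names and sample kinematics are illustrative assumptions.
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g^{mu nu}


def polarization_vectors(energy, momentum):
    """Mass and three real polarization vectors for q^mu = (E, 0, 0, p)."""
    mass = math.sqrt(energy ** 2 - momentum ** 2)
    eps1 = (0.0, 1.0, 0.0, 0.0)                        # transverse
    eps2 = (0.0, 0.0, 1.0, 0.0)                        # transverse
    eps_long = (momentum / mass, 0.0, 0.0, energy / mass)  # longitudinal
    return mass, [eps1, eps2, eps_long]


def polarization_sum_error(energy, momentum):
    """Largest entry of sum(eps^mu eps^nu) + (g^{mu nu} - q^mu q^nu / m^2)."""
    q = (energy, 0.0, 0.0, momentum)
    mass, vectors = polarization_vectors(energy, momentum)
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            lhs = sum(eps[mu] * eps[nu] for eps in vectors)
            g = METRIC[mu] if mu == nu else 0.0
            rhs = -(g - q[mu] * q[nu] / mass ** 2)
            worst = max(worst, abs(lhs - rhs))
    return worst


def transversality_error(energy, momentum):
    """Largest |eps . q| over the three vectors (should vanish)."""
    q = (energy, 0.0, 0.0, momentum)
    _, vectors = polarization_vectors(energy, momentum)
    return max(abs(sum(METRIC[mu] * eps[mu] * q[mu] for mu in range(4)))
               for eps in vectors)


if __name__ == "__main__":
    print(polarization_sum_error(5.0, 3.0), transversality_error(5.0, 3.0))
```

Setting the momentum to zero reproduces the rest-frame statement in the text: the sum reduces to the projector onto the three spatial directions.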

The perturbation theory rules that we have developed have a very different
character for different values of $\xi$. Thus, it is even more true in the case of
spontaneously broken symmetry that we can find different special simplifications
by choosing different values of this gauge parameter. For $\xi = 0$, Lorentz
gauge, the Goldstone boson is massless and has exactly the couplings it has in
the ungauged model of symmetry breaking, while the gauge boson propagator
is purely transverse:

$$\langle A^\mu(k) A^\nu(-k) \rangle = \frac{-i}{k^2 - m_A^2}\Big(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2}\Big); \qquad \langle \varphi(k)\varphi(-k) \rangle = \frac{i}{k^2}. \tag{21.27}$$

This gauge is especially useful for analyzing models of symmetry breaking.
Both propagators have poles at $k^2 = 0$. However, we know that there are no
corresponding physical particles, because these poles move away from $k^2 = 0$
as we change $\xi$, while the S-matrix must be $\xi$-independent.

For $\xi = 1$, we recover the simple form of the gauge boson propagator given
in (21.1). This choice of the gauge boson propagator is not consistent, however,
unless we also include Goldstone boson exchanges in which the Goldstone
boson is also assigned the mass $m_A$:

$$\langle A^\mu(k) A^\nu(-k) \rangle = \frac{-i g^{\mu\nu}}{k^2 - m_A^2}; \qquad \langle \varphi(k)\varphi(-k) \rangle = \frac{i}{k^2 - m_A^2}. \tag{21.28}$$

This gauge, still called the Feynman-'t Hooft gauge, is the most convenient
one for general higher-order computations.

For any finite value of $\xi$, the gauge boson and Goldstone boson propagators
fall off as $1/k^2$ and thus obey the general power-counting analysis of
Section 10.1. It follows that, in any one of these gauges, the perturbation theory
will be renormalizable, in the sense that the divergences are removed by
a finite set of counterterms. Furthermore, the analysis of Section 11.6 tells us
that the only counterterms required are those that are symmetric under the
original global symmetry of the theory. However, we should require one further
condition of our renormalization procedure: We should insist that the
counterterms preserve local gauge invariance, and, in particular, preserve the
property that S-matrix elements and the matrix elements of gauge-invariant
operators are independent of $\xi$. This result was proved to all orders in perturbation
theory by 't Hooft and Veltman and by Lee and Zinn-Justin.* Thus,
in the gauge defined by any finite value of $\xi$, we can, in principle, straightforwardly
compute a physical quantity to any order. The gauges defined by the
possible values of $\xi$ are known as the renormalizability, or $R_\xi$, gauges.

By taking the limit $\xi \to \infty$ of the $R_\xi$ gauges, we find a gauge with very
different simplifying features. In this limit, the unphysical degrees of freedom,
which have masses proportional to $\sqrt{\xi}$, disappear from the theory. The gauge
boson and Goldstone boson propagators become:

$$\langle A^\mu(k) A^\nu(-k) \rangle = \frac{-i}{k^2 - m_A^2}\Big(g^{\mu\nu} - \frac{k^\mu k^\nu}{m_A^2}\Big); \qquad \langle \varphi(k)\varphi(-k) \rangle = 0. \tag{21.29}$$

The gauge boson propagator contains exactly the three spacelike polarization
states. In this gauge, the only singularities of Feynman diagrams correspond to
the propagation of physical intermediate states. Thus, the unitarity of the
S-matrix follows from the Cutkosky rules, as in the globally symmetric theories
considered in Section 7.3, without the need to worry about the cancellation
of unphysical states.† The $\xi \to \infty$ limit of the $R_\xi$ gauges thus gives the
quantum-mechanical realization of the unitarity (or U) gauge, introduced in
Eq. (20.12).
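The three special cases of the $R_\xi$ propagator discussed above (transverse at $\xi = 0$, proportional to $g^{\mu\nu}$ at $\xi = 1$, and the unitarity-gauge form as $\xi \to \infty$) can be compared numerically. This is a rough sketch with assumed function names and an arbitrary sample momentum; it works with the tensor in parentheses in (21.17), dropping the common factor $-i/(k^2 - m_A^2)$.

```python
# Sketch: special values of xi in the tensor part of the propagator (21.17).
# Names and sample numbers are illustrative assumptions.

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g^{mu nu}


def k_squared(k):
    """Minkowski square of a contravariant four-vector."""
    return sum(METRIC[mu] * k[mu] ** 2 for mu in range(4))


def metric_entry(mu, nu):
    return METRIC[mu] if mu == nu else 0.0


def rxi_tensor(k, m2, xi):
    """g^{mu nu} - (1 - xi) k^mu k^nu / (k^2 - xi m^2)."""
    k2 = k_squared(k)
    return [[metric_entry(mu, nu) - (1.0 - xi) * k[mu] * k[nu] / (k2 - xi * m2)
             for nu in range(4)] for mu in range(4)]


def unitary_tensor(k, m2):
    """g^{mu nu} - k^mu k^nu / m^2, the tensor in (21.29)."""
    return [[metric_entry(mu, nu) - k[mu] * k[nu] / m2
             for nu in range(4)] for mu in range(4)]


def contract_with_k(tensor, k):
    """k_mu T^{mu nu}, lowering the index on k with the metric."""
    return [sum(METRIC[mu] * k[mu] * tensor[mu][nu] for mu in range(4))
            for nu in range(4)]


def max_abs_diff(a, b):
    return max(abs(a[mu][nu] - b[mu][nu])
               for mu in range(4) for nu in range(4))


if __name__ == "__main__":
    k, m2 = (3.0, 0.5, -1.2, 0.7), 1.3
    print(contract_with_k(rxi_tensor(k, m2, 0.0), k))  # transverse at xi = 0
    print(max_abs_diff(rxi_tensor(k, m2, 1e9), unitary_tensor(k, m2)))
```

The difference from the unitarity-gauge tensor shrinks like $1/\xi$ at fixed $k$, which is consistent with the remark that the limit is smooth only after the momentum integration issues discussed next are dealt with.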

It is not straightforward to prove renormalizability directly in the U gauge.
In this gauge, the gauge boson propagator falls off more slowly than $1/k^2$ at
large $k$. This signals trouble for the evaluation of loop diagrams. Typically, in
fact, individual loop diagrams will diverge as $\log\xi$ or worse as $\xi \to \infty$. Still, the
gauge invariance of the S-matrix implies that these divergences must cancel
in the sum of all diagrams contributing to a given process, so that this sum
has a smooth limit as $\xi \to \infty$. There is no difficulty of principle with the
fact that we use one gauge to prove the renormalizability of spontaneously
broken gauge theories and another gauge to prove their unitarity. In fact, this
method of argumentation makes natural use of the underlying symmetries of
the theory.

*G. 't Hooft and M. J. G. Veltman, Nucl. Phys. B50, 318 (1972), B. W. Lee and
J. Zinn-Justin, Phys. Rev. D5, 3121, 3137, 3155 (1972), D7, 1049 (1973).

†In the more sophisticated language of Section 16.4, the crucial identity (16.54),
which is required for the unitarity of the S-matrix, is true manifestly.

Non-Abelian Analysis 

Now that we have thoroughly examined the $R_\xi$ gauges for an Abelian gauge
theory, we are ready to generalize to the non-Abelian case. There is no difficulty
in being completely general, so let us consider a Yang-Mills gauge theory
with gauge group $G$, spontaneously broken by the vacuum expectation value
of a scalar field.

We will build on our classical analysis of this system following Eq. (20.13).
As in that analysis, it will be most convenient to write the scalars as a multiplet
$\phi_i$ of real-valued fields. Then the gauge transformation of the $\phi_i$ takes
the form

$$\delta\phi_i = -\alpha^a(x)\, T^a_{ij}\, \phi_j, \tag{21.30}$$

where the $T^a$ are real, antisymmetric representation matrices of $G$. Similarly,
the transformation of the gauge fields is

$$\delta A^a_\mu = \frac{1}{g}\partial_\mu\alpha^a - f^{abc}\alpha^b A^c_\mu = \frac{1}{g}(D_\mu\alpha)^a. \tag{21.31}$$

(If the gauge group is not simple, the coupling $g$ need not be the same for
every $a$.) The Lagrangian invariant under these gauge transformations is

$$\mathcal{L} = -\tfrac{1}{4}(F^a_{\mu\nu})^2 + \tfrac{1}{2}(D_\mu\phi)^2 - V(\phi), \tag{21.32}$$

with

$$D_\mu\phi_i = \partial_\mu\phi_i + g A^a_\mu T^a_{ij}\phi_j. \tag{21.33}$$

Assume that the potential $V(\phi)$ is minimized at a point where some of
the components of $\phi$ acquire vacuum expectation values. As in (20.16), define

$$\langle\phi_i\rangle = (\phi_0)_i. \tag{21.34}$$

We will expand $\phi_i$ about this value:

$$\phi_i(x) = \phi_{0i} + \chi_i(x). \tag{21.35}$$

It will be convenient to divide the space of values $\chi_i$ into two subspaces.
The vectors $T^a\phi_0$ correspond to symmetry transformations of the vacuum
expectation value of $\phi$. The field fluctuations along these directions are the
Goldstone bosons. Let $\{n_i\}$ be an orthonormal basis for this subspace; then
the unit vectors $n_i$ are in 1-to-1 correspondence with the Goldstone bosons.
The field fluctuations orthogonal to all of the vectors $T^a\phi_0$ correspond to the
(massive) physical scalar fields of the spontaneously broken gauge theory.

In the discussion to follow, the vectors $T^a\phi_0$ will play an important role.
We should then recall the notation for these vectors that we introduced in
Eq. (20.51):

$$T^a_{ij}\phi_{0j} = F^a{}_i. \tag{21.36}$$

The matrix $F^a{}_i$ is not generally square; it has one row for each gauge generator,
and one column for each component of $\phi$. However, many of its elements are
zero. Its nonzero elements connect the spontaneously broken gauge generators
and the Goldstone bosons. In Eq. (20.56), we showed that the gauge boson
masses generated through the Higgs mechanism can be written

$$m^2_{ab} = g^2 F^a{}_i F^b{}_i. \tag{21.37}$$


To give a concrete example of a matrix $F^a{}_i$, let us compute it in the GWS
electroweak theory. Following the conventions introduced in Eq. (20.14), we
should rewrite the Higgs field of the GWS model in terms of four real scalar
fields. A convenient parametrization is

$$\phi = \frac{1}{\sqrt{2}}\begin{pmatrix} -i(\varphi^1 - i\varphi^2) \\ v + (h + i\varphi^3) \end{pmatrix}. \tag{21.38}$$

The fields $\varphi^i$ are the Goldstone bosons, and $h$ is the massive Higgs boson. The
vacuum state is simply

$$\phi_0 = \frac{1}{\sqrt{2}}\begin{pmatrix} 0 \\ v \end{pmatrix}.$$

The real representation matrices are

$$T^a = -i\tau^a = -i\frac{\sigma^a}{2}, \qquad T^Y = -iY = -i\frac{1}{2}.$$

A simple computation then shows, for instance, that $T^1\phi_0$ equals $v/2$ times
a unit vector in the $\varphi^1$ direction. Filling in the remaining components of $F^a{}_i$,
with $a = 1, 2, 3, Y$ and $i = 1, 2, 3$, we find

$$g F^a{}_i = \frac{v}{2}\begin{pmatrix} g & 0 & 0 \\ 0 & g & 0 \\ 0 & 0 & g \\ 0 & 0 & -g' \end{pmatrix}. \tag{21.39}$$

We do not need to include the components of $F^a{}_i$ along the direction of the
physical Higgs field $h$; the vectors $T^a\phi_0$ are all orthogonal to this direction.

If we insert (21.35) into (21.32) as a change of variables, we find, for the
quadratic terms in the Lagrangian,

$$\mathcal{L}_2 = -\tfrac{1}{2} A^a_\mu\big(-g^{\mu\nu}\partial^2 + \partial^\mu\partial^\nu\big)A^a_\nu + \tfrac{1}{2}(\partial_\mu\chi)^2$$
$$\qquad + g\,\partial^\mu\chi_i A^a_\mu F^a{}_i + \tfrac{1}{2}(m_A^2)^{ab} A^a_\mu A^{\mu b} - \tfrac{1}{2} M_{ij}\chi_i\chi_j, \tag{21.40}$$

where $(m_A^2)^{ab}$ is the gauge boson mass matrix (21.37) and

$$M_{ij} = \frac{\partial^2}{\partial\phi_i\,\partial\phi_j} V(\phi)\Big|_{\phi_0}. \tag{21.41}$$

We proved in Eq. (11.13) that

$$M_{ij}\, n_j = 0 \tag{21.42}$$

for all possible directions $n_j$ in the subspace spanned by the $T^a\phi_0$, so the
Goldstone bosons are massless.

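To study the quantum theory of this system we start with the functional
integral

$$Z = \int \mathcal{D}A\,\mathcal{D}\chi\; e^{i\int \mathcal{L}[A,\chi]}. \tag{21.43}$$

Using the Faddeev-Popov gauge-fixing procedure, we define this integral,
analogously to (21.10), as

$$Z = C' \cdot \int \mathcal{D}A\,\mathcal{D}\chi\; \exp\Big[i\int d^4x\,\big(\mathcal{L}[A,\chi] - \tfrac{1}{2}(G)^2\big)\Big]\, \det\!\Big(\frac{\delta G}{\delta\alpha}\Big), \tag{21.44}$$

for an arbitrary gauge-fixing function $G(A, \chi)$. The $R_\xi$ gauges are defined by
the choice

$$G^a = \frac{1}{\sqrt{\xi}}\big(\partial_\mu A^{a\mu} - \xi g F^a{}_i \chi_i\big). \tag{21.45}$$

Note that $G$ involves only the components of $\chi$ that lie in the subspace of the
Goldstone bosons.

The gauge-fixing term adds to the Lagrangian the following set of quadratic
terms:

$$\big(-\tfrac{1}{2}G^2\big)_2 = -\tfrac{1}{2} A^a_\mu\Big(-\frac{1}{\xi}\partial^\mu\partial^\nu\Big)A^a_\nu + g\,\partial_\mu A^{a\mu} F^a{}_i \chi_i - \tfrac{1}{2}\xi g^2\big[F^a{}_i\chi_i\big]^2. \tag{21.46}$$

The term that mixes $A^a_\mu$ and $\chi_i$ is arranged to cancel between (21.40) and
(21.46). The final quadratic Lagrangian for the gauge and Goldstone boson
fields is

$$\mathcal{L}_2 = -\tfrac{1}{2} A^a_\mu\Big(\big[-g^{\mu\nu}\partial^2 + \big(1 - \tfrac{1}{\xi}\big)\partial^\mu\partial^\nu\big]\delta^{ab} - g^2 F^a{}_i F^b{}_i\, g^{\mu\nu}\Big)A^b_\nu$$
$$\qquad + \tfrac{1}{2}(\partial_\mu\chi)^2 - \tfrac{1}{2}\xi g^2 F^a{}_i F^a{}_j \chi_i\chi_j. \tag{21.47}$$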

The mass matrices of gauge bosons and Goldstone bosons in this Lagrangian
are closely related to one another. The gauge boson mass matrix
is

$$(m_A^2)^{ab} = g^2 F^a{}_i F^b{}_i = g^2 (F F^T)^{ab}. \tag{21.48}$$

In an $R_\xi$ gauge, the timelike components of the gauge bosons acquire the mass
matrix

$$\xi\, m_A^2 = \xi g^2 (F F^T)^{ab}. \tag{21.49}$$

At the same time, the Goldstone bosons acquire the mass matrix

$$(m_\chi^2)_{ij} = \xi g^2 F^a{}_i F^a{}_j = \xi g^2 (F^T F)_{ij}. \tag{21.50}$$

The two matrices (21.49) and (21.50) have different numbers of zero eigenvalues,
but their nonzero eigenvalues are in 1-to-1 correspondence. This is
precisely the correspondence induced by the Higgs mechanism between the
massive gauge bosons and the Goldstone bosons that they absorbed to gain
mass.
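The correspondence between the nonzero eigenvalues of $FF^T$ and $F^TF$ can be illustrated with the GWS matrix (21.39). The sketch below is only an illustration: the function names are assumptions, and the numerical couplings used in the example are placeholder values rather than measured inputs. Instead of diagonalizing, it compares the traces of the first few matrix powers; since the two matrices are symmetric, equal power sums up to the larger dimension imply the same nonzero spectrum.

```python
# Sketch: nonzero spectra of g^2 F F^T (21.48) and g^2 F^T F (21.50 at xi = 1)
# for the GWS matrix (21.39). Names and sample couplings are assumptions.
import math


def gws_gF(g, g_prime, v):
    """The 4x3 matrix g F^a_i of (21.39), rows a = 1, 2, 3, Y."""
    half_v = v / 2.0
    return [[g * half_v, 0.0, 0.0],
            [0.0, g * half_v, 0.0],
            [0.0, 0.0, g * half_v],
            [0.0, 0.0, -g_prime * half_v]]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def power_trace(m, n):
    """Trace of the n-th power of a square matrix."""
    result = m
    for _ in range(n - 1):
        result = matmul(result, m)
    return sum(result[i][i] for i in range(len(m)))


def spectra_match(gF, rel_tol=1e-9):
    """True if F F^T and F^T F share the same nonzero eigenvalues,
    tested through equality of power sums tr(M^n), n = 1..rows."""
    gauge = matmul(gF, transpose(gF))       # 4x4 gauge boson mass matrix
    goldstone = matmul(transpose(gF), gF)   # 3x3 Goldstone matrix at xi = 1
    return all(math.isclose(power_trace(gauge, n), power_trace(goldstone, n),
                            rel_tol=rel_tol)
               for n in range(1, len(gauge) + 1))


def gws_masses(g, g_prime, v):
    """(m_W, m_Z) read off the diagonal Goldstone matrix at xi = 1."""
    gF = gws_gF(g, g_prime, v)
    goldstone = matmul(transpose(gF), gF)
    return math.sqrt(goldstone[0][0]), math.sqrt(goldstone[2][2])


if __name__ == "__main__":
    print(spectra_match(gws_gF(0.65, 0.35, 246.0)))
    print(gws_masses(0.65, 0.35, 246.0))
```

The extra zero eigenvalue of the $4\times4$ matrix belongs to the photon direction, which has no Goldstone partner.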
Finally, we must construct the ghost Lagrangian. This is found from the
gauge variation of the gauge-fixing term $G^a$. Inserting (21.30) and (21.31) into
(21.45), we find

$$\frac{\delta G^a}{\delta\alpha^b} = \frac{1}{\sqrt{\xi}}\Big(\frac{1}{g}(\partial_\mu D^\mu)^{ab} + \xi g\,(T^a\phi_0)\cdot T^b(\phi_0 + \chi)\Big). \tag{21.51}$$

Thus, the ghost Lagrangian is

$$\mathcal{L}_{\text{ghost}} = \bar{c}^a\Big[-(\partial_\mu D^\mu)^{ab} - \xi g^2\,(T^a\phi_0)\cdot T^b(\phi_0 + \chi)\Big]c^b. \tag{21.52}$$

Notice that the ghosts have exactly the same mass matrix (21.49) as the
unphysical components of the gauge bosons. This Lagrangian also contains
both the familiar coupling of the ghosts to the gauge fields and the coupling
to the physical Higgs fields that we found in the Abelian case (21.15).

We have now computed the kinetic energy terms for gauge fields, scalar
fields, and ghosts in an $R_\xi$ gauge. It is straightforward to convert these results
to the calculation of propagators for these fields; the computations are exactly
the same as in the Abelian case. We find for the three propagators

$$\langle A^a_\mu(k) A^b_\nu(-k) \rangle = \Big(\frac{-i}{k^2 - g^2 FF^T}\Big[g_{\mu\nu} - \frac{k_\mu k_\nu}{k^2 - \xi g^2 FF^T}(1 - \xi)\Big]\Big)^{ab},$$
$$\langle \chi_i(k)\chi_j(-k) \rangle = \Big(\frac{i}{k^2 - \xi g^2 F^T F - M^2}\Big)_{ij},$$
$$\langle c^a(k)\bar{c}^b(-k) \rangle = \Big(\frac{i}{k^2 - \xi g^2 FF^T}\Big)^{ab}. \tag{21.53}$$

All of these equations involve the matrix $F$ defined in Eq. (21.36); the appearance
of a matrix in the denominator should be interpreted as a matrix inverse.
The scalar field propagator also includes the mass matrix (21.41) of the physical
Higgs bosons. There is no conflict between this matrix and the mass matrix
of the Goldstone bosons, since they project onto orthogonal subspaces.

Although the preceding discussion has been extremely abstract, it is not 
hard to specialize to a particular example. So consider, once again, the GWS 
electroweak theory, for which the matrix F% is given by Eq. (21.39). 

The gauge boson mass matrix in the GWS theory is

$$g^2 F F^T = \frac{v^2}{4}\begin{pmatrix} g^2 & 0 & 0 & 0 \\ 0 & g^2 & 0 & 0 \\ 0 & 0 & g^2 & -gg' \\ 0 & 0 & -gg' & g'^2 \end{pmatrix},$$

in agreement with Eq. (20.124). (The $g$ on the left-hand side should be interpreted
as $g'$ for the fourth component of $F$.) Diagonalizing this matrix gives
the familiar relations (20.62). Thus, in the basis of mass eigenstates, the four
gauge-boson propagators decouple to give simply


$$\frac{-i}{k^2 - m^2}\left(g^{\mu\nu} - \frac{k^\mu k^\nu}{k^2 - \xi m^2}(1-\xi)\right), \tag{21.54}$$

where $m^2$ is $m_W^2$, $m_Z^2$, or, for the photon, zero. Notice that, for the photon,
this expression precisely reproduces Eq. (9.58).

The mass matrix of the Goldstone bosons in the GWS theory is

$$\xi g^2 F^T F = \xi\,\frac{v^2}{4}\begin{pmatrix} g^2 & 0 & 0 \\ 0 & g^2 & 0 \\ 0 & 0 & g^2+g'^2 \end{pmatrix}.$$

These fields therefore have the propagator

$$\frac{i}{k^2 - \xi m^2}, \tag{21.55}$$

with $m^2 = m_W^2$ for $\phi^1$ and $\phi^2$ (the bosons eaten by the $W^\pm$) and $m^2 = m_Z^2$
for $\phi^3$ (the boson eaten by the $Z$). The field $h(x)$, which is the physical Higgs
field, propagates independently with a mass determined by the Higgs potential
(and no factor of $\xi$ in the propagator).

Finally, there are four ghost fields. According to Eq. (21.53), these have
the propagator

$$\frac{i}{k^2 - \xi m^2}, \tag{21.56}$$

with the same values of $m^2$ as the four gauge bosons.

The Feynman rules for the interaction vertices of these particles are complicated
to write out, due to the large number of possible combinations. However,
it is quite straightforward to generate these rules by expanding the weak
interaction Lagrangian and reading off the vertices term by term. We will
work out a few examples in the following section.*


21.2 The Goldstone Boson Equivalence Theorem 

From the results of the previous section, we see that perturbative calculations 
in the $R_\xi$ gauges involve intricate cancellations among unphysical particles.
Sometimes, however, these unphysical particles can still leave their footprints 
in physical observables. In this section we will see that, in the high-energy 
limit, the unphysical Goldstone boson that is eaten by a massive gauge boson 
still controls the amplitude for emission or absorption of the gauge boson in 
its longitudinal polarization state. 


*The complete Feynman rules for the weak-interaction gauge theory are given in
Appendix B of Cheng and Li (1984).

Figure 21.3. The Goldstone boson equivalence theorem. At high energy,
the amplitude for emission or absorption of a longitudinally polarized massive 
gauge boson becomes equal to the amplitude for emission or absorption of 
the Goldstone boson that was eaten by the gauge boson. 

When we introduced the Higgs mechanism for vector boson mass generation,
we pointed out that it involves a certain conservation of degrees of freedom.
A massless gauge boson, which has two transverse polarization states,
combines with a scalar Goldstone boson to produce a massive vector particle,
which has three polarization states. When the massive vector particle is
at rest, its three polarization states are completely equivalent, but when it
is moving relativistically, there is a clear distinction between the transverse
and longitudinal polarization directions. This suggests that a rapidly moving,
longitudinally polarized massive gauge boson might betray its origin as a
Goldstone boson. The strongest version of this idea is expressed in Fig. 21.3:
The amplitude for emission or absorption of a longitudinally polarized gauge
boson becomes equal, at high energy, to the amplitude for emission or absorption
of the Goldstone boson that was eaten. Remarkably, this statement
is precisely correct, as a consequence of the underlying local gauge invariance.
This Goldstone boson equivalence theorem was first proved by Cornwall,
Levin, Tiktopoulos, and Vayonakis.†

Formal Aspects of Goldstone Boson Equivalence 

The proof of the Goldstone boson equivalence theorem is based on the Ward 
identities of the spontaneously broken gauge theory. To give a complete proof 
of the theorem, we would have to construct and analyze these Ward identities 
in some detail. However, it is possible to understand the idea of the proof by 
examining the special case of the theorem in which a single massive vector 
boson is emitted or absorbed in a scattering process. The analysis of this 
special case requires only the relatively simple Ward identity satisfied by a 
current between on-shell states.‡


†J. M. Cornwall, D. N. Levin, and G. Tiktopoulos, Phys. Rev. D10, 1145 (1974);
C. E. Vayonakis, Lett. Nuov. Cim. 17, 383 (1976). For an illuminating discussion of
the equivalence theorem, see B. W. Lee, C. Quigg, and H. Thacker, Phys. Rev. D16,
1519 (1977).

‡For a careful derivation of the equivalence theorem, including processes involving
To prepare for a discussion of longitudinal vector bosons, we need some
simple kinematics. A vector boson at rest has momentum $k^\mu = (m, 0, 0, 0)$
and a polarization vector that is a linear combination of the three orthogonal
unit vectors

$$(0,1,0,0), \quad (0,0,1,0), \quad (0,0,0,1). \tag{21.57}$$

If we boost this particle along the 3 axis, its momentum boosts to
$k^\mu = (E_{\mathbf k}, 0, 0, k)$. The three possible polarization vectors are now the three unit
vectors satisfying

$$k_\mu\epsilon^\mu = 0, \qquad \epsilon^2 = -1. \tag{21.58}$$

Two of these are the first two vectors in (21.57); these give the transverse
polarizations. The third vector satisfying (21.58) is the longitudinal polarization
vector

$$\epsilon_L^\mu(k) = \left(\frac{k}{m}, 0, 0, \frac{E_{\mathbf k}}{m}\right), \tag{21.59}$$

which is the boost of the third vector in (21.57). An important and somewhat
counterintuitive feature of (21.59) is that it becomes increasingly parallel to
$k^\mu$ as $k$ becomes large. In fact, component by component,

$$\epsilon_L^\mu(k) = \frac{k^\mu}{m} + \mathcal{O}(m/E_{\mathbf k}) \tag{21.60}$$

as $k \to \infty$. Since the components of $k^\mu$ are growing as $k$, this statement is
consistent with the requirement that $\epsilon_L \cdot k = 0$ while $k \cdot k = m^2$.
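The statements (21.58) to (21.60) are easy to check numerically. The following is a minimal sketch, not part of the original derivation; the function names and the illustrative mass value are our own assumptions. It builds the boosted momentum and the longitudinal polarization vector and compares the difference between $\epsilon_L^\mu$ and $k^\mu/m$ with $m/E_{\mathbf k}$.

```python
import math


def minkowski_dot(a, b):
    """Dot product with metric signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def boosted_vectors(m, k):
    """Momentum and longitudinal polarization for a boost along the 3 axis."""
    energy = math.sqrt(k * k + m * m)
    momentum = (energy, 0.0, 0.0, k)
    eps_long = (k / m, 0.0, 0.0, energy / m)
    return momentum, eps_long


def max_deviation(m, k):
    """Largest component of eps_L - k/m, to be compared with m / E_k."""
    momentum, eps_long = boosted_vectors(m, k)
    return max(abs(e - p / m) for e, p in zip(eps_long, momentum))


if __name__ == "__main__":
    m = 80.4  # illustrative vector boson mass in GeV (assumed value)
    for k in (100.0, 1000.0, 10000.0):
        momentum, eps_long = boosted_vectors(m, k)
        print(
            k,
            minkowski_dot(momentum, eps_long),  # should vanish, (21.58)
            minkowski_dot(eps_long, eps_long),  # should equal -1, (21.58)
            max_deviation(m, k),                # shrinks like m / E_k, (21.60)
            m / momentum[0],
        )
```

The deviation equals $m/(E_{\mathbf k} + k)$ exactly, so it always stays below $m/E_{\mathbf k}$ and falls off as the boost grows.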

With this kinematic situation in mind, let us analyze the Ward identity
satisfied by a gauge current matrix element between on-shell states. It is simplest
to work in Lorentz gauge ($\xi = 0$), where the gauge-fixing term (21.45)
does not involve the Goldstone boson fields. The Ward identity can then be
written as follows:


(21.61) 


In the last expression we have written the matrix element as the sum of two
pieces. First, the current can couple directly into a one-particle-irreducible
vertex function $\Gamma^\mu(k)$. This gives the class of diagrams that contribute to the
scattering of a gauge boson from the external states. However, for a spontaneously
broken gauge theory, there is an additional term, which is not
one-particle-irreducible, in which the current creates a Goldstone boson and it is
this particle that couples to the external states through a 1PI vertex $\Gamma(k)$.

Let us write the relation linking the gauge current and the Goldstone
boson state as

$$\langle 0|\,J^\mu\,|\pi(k)\rangle = -iFk^\mu, \tag{21.62}$$


multiple absorptions and emissions of massive vector bosons, see M. S. Chanowitz and
M. K. Gaillard, Nucl. Phys. B261, 379 (1985).

as in Eq. (20.46). Then the argument leading to Eq. (20.56) tells us that the
gauge boson mass is given by 

m = gF, (21.63) 

where g is the gauge boson coupling constant. 

With these identifications, we can write the Ward identity that follows
from the conservation of the gauge current:

$$k_\mu\langle J^\mu\rangle = 0, \tag{21.64}$$

between on-shell states. Writing each term shown in (21.61) in terms of the
appropriate one-particle-irreducible vertex function, we find

$$k_\mu\Gamma^\mu(k) + k_\mu(igFk^\mu)\frac{i}{k^2}\Gamma(k) = 0. \tag{21.65}$$

Thus,

$$k_\mu\Gamma^\mu(k) = m\,\Gamma(k). \tag{21.66}$$

Now use this equation in the limit of large gauge boson momentum. Since the
gauge boson vertex is one-particle-irreducible, the momenta of propagators
inside the vertex are not, in general, collinear with $k^\mu$. Then, according to
(21.60), we may replace $k^\mu/m$ by the longitudinal polarization vector. Notice
that this would not be permissible (but, also, is not necessary) in the second
term of (21.65). Our final result is

$$\epsilon_{L\mu}(k)\Gamma^\mu(k) = \Gamma(k), \tag{21.67}$$

as $k \to \infty$, with an error of order $m^2/k^2$. That is, in the high-energy limit,
the couplings of longitudinal gauge bosons become precisely those of their
associated Goldstone bosons.

The equivalence theorem can be derived in another way, using the counting
of physical states in spontaneously broken gauge theories, which we discussed
below Eq. (21.26). In the previous section, we saw that, at least at
the tree level, unitarity is maintained in spontaneously broken gauge theories
by the cancellation of diagrams that produce timelike-polarized gauge bosons
against diagrams that produce Goldstone bosons.

The situation is most clear in Feynman-'t Hooft gauge. There, the numerator
of the gauge boson propagator is $-g^{\mu\nu}$. We can write this in terms
of polarization vectors as

$$-g^{\mu\nu} = \sum_{i=1,2,3}\epsilon_i^{\mu}\epsilon_i^{\nu *} - \frac{k^\mu k^\nu}{m^2}. \tag{21.68}$$

The last term is the contribution from unphysical timelike polarization states.
The unitarity of the $S$-matrix requires that, when a Cutkosky cut through a
diagram puts a gauge boson propagator on-shell, the contribution of this piece
must be canceled by a Cutkosky cut that runs through a Goldstone boson line.
The required cancellation is

$$-\left|\Gamma^\mu(k)\frac{k_\mu}{m}\right|^2 + |\Gamma(k)|^2 = 0, \tag{21.69}$$

or, diagrammatically,


Once again, since $\Gamma^\mu(k)$ is a one-particle-irreducible vertex, we can use (21.60)
to replace $(k^\mu/m)$ by the longitudinal polarization vector $\epsilon_L^\mu(k)$ for a
high-energy gauge boson. Then (21.69) becomes just the square of (21.67).

Figure 21.4. Decay of a $t$ quark into $W^+ + b$.

Through these formal arguments, we can see, at least to the tree level 
in processes with single gauge boson emission, that the equivalence theorem 
must be valid. However, it is much more illuminating to see the equivalence 
theorem at work in explicit calculations for interesting physical processes. We 
will now illustrate its influence in two examples. 

Top Quark Decay 

The first example is the weak decay of the top quark. This charge $+2/3$ quark
is sufficiently heavy that it can decay to a real $W^+$ through $t \to W^+ + b$. The
diagram for this decay is given by the simple gauge vertex shown in Fig. 21.4.

Let us first try to guess the magnitude of the top quark width. The squared
matrix element will contain a factor of $g^2$, times some expression with dimensions
of mass. Since the width should be large if the top quark is heavy,
a first guess might be

$$\Gamma \sim \frac{g^2}{4\pi}\,m_t. \tag{21.70}$$

The correct expression, however, turns out to be enhanced by a factor of
$(m_t/m_W)^2$.

The amplitude for this decay can be read from Eq. (20.80):

$$i\mathcal{M} = \frac{ig}{\sqrt2}\,\bar u(q)\gamma^\mu\left(\frac{1-\gamma^5}{2}\right)u(p)\,\epsilon^*_\mu(k). \tag{21.71}$$

(We set the relevant CKM factor equal to 1.) We will now turn this amplitude
into an expression for the decay rate of the top quark. For simplicity, we will
ignore the mass of the $b$ quark in this computation.

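Squaring the amplitude in (21.71) according to our standard methods,
and then averaging over initial and summing over final spins, we find

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \frac{g^2}{2}\left[q^\mu p^\nu + q^\nu p^\mu - g^{\mu\nu}\,q\cdot p\right]\sum_{\rm polarizations}\epsilon^*_\mu(k)\epsilon_\nu(k). \tag{21.72}$$

We can sum explicitly over physical gauge boson polarizations by inserting
the expression (21.26) for the polarization sum. This gives

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \frac{g^2}{2}\left[q^\mu p^\nu + q^\nu p^\mu - g^{\mu\nu}\,q\cdot p\right]\left[-g_{\mu\nu} + \frac{k_\mu k_\nu}{m_W^2}\right] = \frac{g^2}{2}\left[q\cdot p + 2\,\frac{(k\cdot q)(k\cdot p)}{m_W^2}\right]. \tag{21.73}$$

For $m_b = 0$,

$$2q\cdot p = 2q\cdot k = m_t^2 - m_W^2, \qquad 2k\cdot p = m_t^2 + m_W^2. \tag{21.74}$$

Then

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \frac{g^2}{4}\frac{m_t^4}{m_W^2}\left(1 - \frac{m_W^2}{m_t^2}\right)\left(1 + 2\frac{m_W^2}{m_t^2}\right). \tag{21.75}$$

After multiplying by phase space, we find

$$\Gamma = \frac{g^2}{64\pi}\frac{m_t^3}{m_W^2}\left(1 - \frac{m_W^2}{m_t^2}\right)^2\left(1 + 2\frac{m_W^2}{m_t^2}\right). \tag{21.76}$$

This is larger than our initial estimate (21.70) by a factor $(m_t/m_W)^2$.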
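As a cross-check of the algebra leading from (21.73) to (21.76), one can evaluate the dot products with explicit four-vectors in the top quark rest frame. The sketch below is only illustrative: the function names are ours, and the numerical values of $g$, $m_t$, and $m_W$ are assumed inputs rather than values taken from the text. The two-body phase space factor $|\mathbf{k}|/(8\pi m_t^2)$ used in the last function is the standard one for a decay at rest.

```python
import math


def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def decay_momenta(m_t, m_w):
    """Momenta p (top at rest), q (massless b), k (W) for t -> W b."""
    k_abs = (m_t**2 - m_w**2) / (2.0 * m_t)
    p = (m_t, 0.0, 0.0, 0.0)
    q = (k_abs, 0.0, 0.0, -k_abs)
    k = (math.sqrt(k_abs**2 + m_w**2), 0.0, 0.0, k_abs)
    return p, q, k


def averaged_me_from_momenta(g, m_t, m_w):
    """Last form of (21.73), evaluated with explicit four-vectors."""
    p, q, k = decay_momenta(m_t, m_w)
    return 0.5 * g**2 * (dot(q, p) + 2.0 * dot(k, q) * dot(k, p) / m_w**2)


def averaged_me_closed_form(g, m_t, m_w):
    """Closed form (21.75)."""
    r = m_w**2 / m_t**2
    return 0.25 * g**2 * m_t**4 / m_w**2 * (1.0 - r) * (1.0 + 2.0 * r)


def width_closed_form(g, m_t, m_w):
    """Closed form (21.76)."""
    r = m_w**2 / m_t**2
    prefactor = g**2 / (64.0 * math.pi) * m_t**3 / m_w**2
    return prefactor * (1.0 - r) ** 2 * (1.0 + 2.0 * r)


def width_from_phase_space(g, m_t, m_w):
    """Two-body width: |k| / (8 pi m_t^2) times the averaged |M|^2."""
    k_abs = (m_t**2 - m_w**2) / (2.0 * m_t)
    return k_abs / (8.0 * math.pi * m_t**2) * averaged_me_from_momenta(g, m_t, m_w)


if __name__ == "__main__":
    g, m_t, m_w = 0.65, 173.0, 80.4  # assumed illustrative inputs (GeV)
    print(averaged_me_from_momenta(g, m_t, m_w), averaged_me_closed_form(g, m_t, m_w))
    print(width_from_phase_space(g, m_t, m_w), width_closed_form(g, m_t, m_w))
```

The two numbers on each printed line should agree to rounding error, which confirms both the kinematic relations (21.74) and the phase space factor.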

It is not difficult to find the origin of this enhancement, by using the Goldstone
boson equivalence theorem. In the gauge theory of weak interactions,
the top quark obtains its mass from its coupling to the Higgs sector. The relation
between the top-Higgs coupling $\lambda_t$ and the top quark mass is written
in Eq. (20.103). The top quark can be heavy only if $\lambda_t$ is large. But then the
amplitude for the top quark to decay to a Goldstone boson will be enhanced
above (21.70) by the factor

$$\frac{\lambda_t^2}{g^2} = \frac{m_t^2}{2m_W^2}, \tag{21.77}$$

which is in fact the enhancement we found in (21.76).

To make the comparison more precise, we will now compute the prediction
of the equivalence theorem for the top quark decay rate into a longitudinally
polarized $W^+$ boson. Recall from (20.101) that the term in the weak interaction
Lagrangian that couples $t$ and $b$ to the Higgs field is

$$\Delta\mathcal{L} = -\lambda_t\,\epsilon^{ab}\,\bar Q_{La}\,\phi^\dagger_b\,t_R + \text{h.c.} \tag{21.78}$$

Decompose the Higgs field as in (21.38), and write

$$\phi^\pm = \frac{1}{\sqrt2}\left(\phi^1 \pm i\phi^2\right). \tag{21.79}$$

These are the fields of the charged Goldstone bosons that are eaten by the
$W^\pm$. Including the Goldstone boson in the theory adds a process $t \to \phi^+ + b$,
shown in Fig. 21.5. This process is mediated by the Lagrangian term

$$\Delta\mathcal{L} = \lambda_t\,\bar b_L\,\phi^+\,t_R, \tag{21.80}$$

which leads to the decay amplitude

$$i\mathcal{M} = i\lambda_t\,\bar u(q)\left(\frac{1+\gamma^5}{2}\right)u(p). \tag{21.81}$$

From this expression, we easily find

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \lambda_t^2\,q\cdot p. \tag{21.82}$$

If we now ignore the mass of the Goldstone boson, or, equivalently, consider
the limit $m_t \gg m_W$, we find for the top quark decay rate

$$\Gamma = \frac{\lambda_t^2}{32\pi}\,m_t = \frac{g^2}{64\pi}\frac{m_t^3}{m_W^2}, \tag{21.83}$$

in agreement with the leading term of (21.76) in this limit. Our results imply
that only the production of the longitudinal polarization state of the $W^+$ is
enhanced; this is easily checked directly by substituting explicit polarization
vectors into (21.72).

Figure 21.5. Decay of a $t$ quark into a Goldstone boson and a $b$ quark.

In our derivation of (21.76), we summed over the physical polarization
states of the emitted $W^+$; one might say that we used the prescription of
the U gauge to sum over polarizations. We could equally well have used the
prescription of Feynman-'t Hooft gauge, replacing

$$\sum_i\epsilon^*_{i\mu}(k)\,\epsilon_{i\nu}(k) \to -g_{\mu\nu}, \tag{21.84}$$

and also adding the contribution of the Goldstone boson emission diagram,
treating the Goldstone boson as a massive particle with mass $m_W$. With these
prescriptions, the gauge boson matrix element gives

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \frac{g^2}{2}\,(2q\cdot p) = \frac{g^2}{2}\left(m_t^2 - m_W^2\right). \tag{21.85}$$

The Goldstone boson emission diagram gives

$$\frac12\sum_{\rm spins}|\mathcal{M}|^2 = \lambda_t^2\,q\cdot p = \frac{g^2}{4}\frac{m_t^2}{m_W^2}\left(m_t^2 - m_W^2\right). \tag{21.86}$$

The sum of these contributions indeed reproduces (21.75) and thus gives the
same result (21.76) for the total decay rate. In Feynman-'t Hooft gauge, the
enhancement due to the large coupling of the top quark to the Higgs sector
shows up explicitly in the Goldstone boson emission contributions to the total
rate of $W^+$ production.
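The agreement between the two gauge prescriptions can be confirmed with a few lines of arithmetic. The sketch below is our own illustration (function names and input values are assumptions): it adds the vector and Goldstone contributions (21.85) and (21.86) and compares the sum with the unitarity gauge result (21.75). It also reports the share of the total carried by the Goldstone term.

```python
import math


def unitarity_gauge_total(g, m_t, m_w):
    """Averaged |M|^2 from the physical polarization sum, Eq. (21.75)."""
    r = m_w**2 / m_t**2
    return 0.25 * g**2 * m_t**4 / m_w**2 * (1.0 - r) * (1.0 + 2.0 * r)


def feynman_gauge_vector(g, m_t, m_w):
    """Gauge boson contribution with the replacement (21.84), Eq. (21.85)."""
    return 0.5 * g**2 * (m_t**2 - m_w**2)


def feynman_gauge_goldstone(g, m_t, m_w):
    """Goldstone boson emission contribution, Eq. (21.86)."""
    return 0.25 * g**2 * (m_t**2 / m_w**2) * (m_t**2 - m_w**2)


def goldstone_share(g, m_t, m_w):
    """Fraction of the Feynman-'t Hooft gauge total due to the Goldstone term."""
    vector = feynman_gauge_vector(g, m_t, m_w)
    goldstone = feynman_gauge_goldstone(g, m_t, m_w)
    return goldstone / (vector + goldstone)


if __name__ == "__main__":
    g, m_t, m_w = 0.65, 173.0, 80.4  # assumed illustrative inputs (GeV)
    total = feynman_gauge_vector(g, m_t, m_w) + feynman_gauge_goldstone(g, m_t, m_w)
    print(total, unitarity_gauge_total(g, m_t, m_w))
    print(goldstone_share(g, m_t, m_w))
```

Algebraically the share is $m_t^2/(m_t^2 + 2m_W^2)$, which approaches 1 as $m_t/m_W$ grows, in line with the statement that the enhancement comes from the Goldstone boson coupling.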


$e^+e^- \to W^+W^-$


Our second example is more complicated, but also contains more interesting
physics. This is the reaction $e^+e^- \to W^+W^-$. In this reaction, the equivalence
theorem does not lead to an enhancement of the cross section, but,
rather, directs a cancellation between Feynman diagrams. As we will see, this
cancellation is essential for the internal consistency of the theory.

In Problem 9.1, we computed the cross section for $e^+e^-$ annihilation into
a pair of charged scalar particles, as in Fig. 21.6(a), and found the result

$$\frac{d\sigma}{d\cos\theta}(e^+e^- \to \phi^+\phi^-) = \frac{\pi\alpha^2}{4s}\sin^2\theta \tag{21.87}$$

at energies much larger than the scalar mass. Just as for $e^+e^-$ annihilation to
fermion pairs, this cross section falls as $1/s$ at high energy. It can be shown
that this behavior is required by unitarity: Since the electron and positron
annihilate through a pointlike vertex, the annihilation takes place in only one
partial wave. Unitarity puts a limit on the amplitude in this partial wave,
requiring that $\mathcal{M}$ be bounded by a constant, and thus that $\sigma$ be bounded by
$1/s$ at high energy.*

The same unitarity argument applies to $e^+e^-$ annihilation to vector
bosons. Here, however, it is much less obvious that Feynman diagrams actually
produce a cross section consistent with unitarity. Consider the contribution
of Fig. 21.6(b). We would expect that the square of this diagram
should contain a contribution to the cross section of the form of the scalar
contribution (21.87) multiplied by the dot product of polarization vectors:

$$\frac{d\sigma}{d\cos\theta}(e^+e^- \to W^+W^-) \sim \frac{\pi\alpha^2}{4s}\sin^2\theta\,\left|\epsilon(k_+)\cdot\epsilon(k_-)\right|^2, \tag{21.88}$$


*Partial-wave analysis for relativistic collisions is discussed in Perkins (1987), 
Chapter 4. 





Figure 21.6. Electron-positron annihilation through a virtual photon (a) to 
charged scalar bosons, (b) to W bosons. 


where $k_+$ and $k_-$ are the momenta of the outgoing $W$ bosons. For transversely
polarized $W$ bosons, this term is well behaved, but for longitudinally polarized
$W$'s it leads to problems. Using the approximation (21.60) for the longitudinal
polarization vectors, we find

$$\epsilon(k_+)\cdot\epsilon(k_-) \to \frac{k_+\cdot k_-}{m_W^2} \approx \frac{s}{2m_W^2} \tag{21.89}$$

for $s \gg m_W^2$. This leads to a cross section that grows much faster than is
allowed by unitarity. In principle, the cross section could be brought back down
to a proper behavior by the addition of contributions from higher orders in
perturbation theory, but this would be a most unpleasant resolution. It would
imply that the theory of $W$ bosons becomes strongly coupled at energies such
that



corresponding to center-of-mass energies of order 1000 GeV. But if the theory
of $W$ bosons is strongly coupled at short distances, it is hard to understand
why, at large distances, it should become the simple, weak-coupling theory
that we observe.
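The growth of the longitudinal polarization product can be seen directly from the exact polarization vectors (21.59) in the center-of-mass frame. The short sketch below is our own illustration, with assumed function names and an assumed value of $m_W$. For back-to-back $W$ bosons the exact product works out to $s/(2m_W^2) - 1$, so the leading behavior quoted in (21.89) is reached quickly once $s \gg m_W^2$.

```python
import math


def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def longitudinal_product(sqrt_s, m_w):
    """eps_L(k+) . eps_L(k-) for back-to-back W bosons along the 3 axis."""
    energy = 0.5 * sqrt_s
    p = math.sqrt(energy**2 - m_w**2)
    eps_plus = (p / m_w, 0.0, 0.0, energy / m_w)
    eps_minus = (p / m_w, 0.0, 0.0, -energy / m_w)
    return dot(eps_plus, eps_minus)


if __name__ == "__main__":
    m_w = 80.4  # assumed illustrative W mass in GeV
    for sqrt_s in (200.0, 500.0, 1000.0, 5000.0):
        s = sqrt_s**2
        exact = longitudinal_product(sqrt_s, m_w)
        leading = s / (2.0 * m_w**2)
        print(sqrt_s, exact, leading, exact / leading)
```

Because the cross section involves the square of this product, the naive estimate (21.88) grows like $s$ instead of falling like $1/s$.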

Fortunately, there is another possible resolution of this problem. In the
weak interaction gauge theory, there are three Feynman diagrams that contribute
to the amplitude for $e^+e^- \to W^+W^-$ at the tree level; these are
shown in Fig. 21.7. Each diagram separately produces a cross section that
grows in the same manner as (21.88). However, it is possible that the badly
behaved terms might cancel among the three diagrams, leaving a more proper
high-energy behavior. If this miraculous cancellation were to occur, it would
allow the theory of $W$ bosons to be consistently weakly coupled up to very
high energies.

Although such a cancellation seems unlikely at first sight, it is actually
required by the Goldstone boson equivalence theorem. The theorem states
that, at high energy, the cross section for producing longitudinal $W$ bosons
should be equal to the cross section for producing the corresponding scalar
Goldstone bosons. But we know that scalar cross sections behave as $1/s$, as
indicated in (21.87). Thus, somehow, the gauge boson cross section must also
conspire to produce this result. We will now show this explicitly. We will see
that the required cancellations are directed by the Ward identities of the gauge
theory.

Figure 21.7. Diagrams contributing to $e^+e^- \to W^+W^-$ in the weak interaction
gauge theory.

To prepare for this calculation, we need the Feynman rules for the vertices
shown in Fig. 21.8. The Feynman rules for the couplings of the electron to $W$,
$Z$, and $\gamma$ can be read directly from (20.80). The relative strengths of these
couplings are determined by the $SU(2) \times U(1)$ quantum numbers of the left-
and right-handed components of the electron. It is equally straightforward to
construct the couplings of the Goldstone bosons to $Z$ and $\gamma$. Since the boson
$\phi^+$ has electric charge 1, the photon coupling is just that found in Problem 9.1.
The $Z$ coupling is determined with the additional information that the $\phi^+$ has
$I^3 = +1/2$. All of these expressions are shown in Fig. 21.8.

The three-gauge-boson vertices that appear in Fig. 21.7 arise from the
cubic terms in the gauge field action. Since the $U(1)$ field strength is linear in
gauge fields, these come only from the kinetic term of the $SU(2)$ gauge field.
To identify the specific pieces we need, we must rewrite this cubic term in the
basis of mass eigenstates given by (20.63) and (20.64). This can be done as
follows:

$$\begin{aligned}
-\frac14\left(F^a_{\mu\nu}\right)^2 &\to -\frac12\left(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu\right)g\,\epsilon^{abc}A^{\mu b}A^{\nu c} \\
&= -g\left(\partial_\mu A^1_\nu - \partial_\nu A^1_\mu\right)A^{\mu 2}A^{\nu 3} + g\left(\partial_\mu A^2_\nu - \partial_\nu A^2_\mu\right)A^{\mu 1}A^{\nu 3} \\
&\quad - g\left(\partial_\mu A^3_\nu - \partial_\nu A^3_\mu\right)A^{\mu 1}A^{\nu 2} \\
&= ig\Big[\left(\partial_\mu W^+_\nu - \partial_\nu W^+_\mu\right)W^{\mu -}A^{\nu 3} - \left(\partial_\mu W^-_\nu - \partial_\nu W^-_\mu\right)W^{\mu +}A^{\nu 3} \\
&\quad + \frac12\left(\partial_\mu A^3_\nu - \partial_\nu A^3_\mu\right)\left(W^{\mu +}W^{\nu -} - W^{\mu -}W^{\nu +}\right)\Big].
\end{aligned} \tag{21.91}$$

Finally, inserting $A^3_\mu = \cos\theta_w Z_\mu + \sin\theta_w A_\mu$ and $g = e/\sin\theta_w$, we find the
Feynman rules shown in Fig. 21.9.

Before examining the amplitude for $e^+e^-$ annihilation to vector boson
pairs, we will first work out the amplitude for production of a pair of charged
scalars. The equivalence theorem predicts that the amplitude for production
of two longitudinal $W$ bosons should become equal to this amplitude at high
energy. Assembling vertices from Fig. 21.8, we find that, for an electron of
either helicity, the amplitude to annihilate to scalars through a virtual photon
is

Figure 21.8. Feynman rules of the weak-interaction gauge theory for electrons
and scalars coupling to photons and $Z$ bosons.

Figure 21.9. Feynman rules of the weak-interaction gauge theory for $WW\gamma$
and $WWZ$ vertices.

$$i\mathcal{M}(e^-e^+ \to \gamma^* \to \phi^+\phi^-) = ie^2\,\bar v\gamma_\mu u\,\frac{1}{q^2}(k_+ - k_-)^\mu, \tag{21.92}$$

where $k_+$, $k_-$ are the momenta of the scalars and $q = k_+ + k_-$. The corresponding
amplitude for annihilation through a virtual $Z^0$ depends on the
$e^+e^-$ helicities. Adding these contributions to the preceding expression, we
find

$$\begin{aligned}
i\mathcal{M}(e^-_Le^+_R \to \phi^+\phi^-) &= ie^2\,\bar v_L\gamma_\mu u_L\left[\frac{1}{q^2} + \frac{(\frac12 - \sin^2\theta_w)^2}{\sin^2\theta_w\cos^2\theta_w}\frac{1}{q^2 - m_Z^2}\right](k_+ - k_-)^\mu, \\
i\mathcal{M}(e^-_Re^+_L \to \phi^+\phi^-) &= ie^2\,\bar v_R\gamma_\mu u_R\left[\frac{1}{q^2} - \frac{(\frac12 - \sin^2\theta_w)}{\cos^2\theta_w}\frac{1}{q^2 - m_Z^2}\right](k_+ - k_-)^\mu.
\end{aligned} \tag{21.93}$$

Notice that, in the high-energy limit, the amplitude for the annihilation of
right-handed electrons cancels down to

$$i\mathcal{M}(e^-_Re^+_L \to \phi^+\phi^-) \to i\,\frac{e^2}{2\cos^2\theta_w}\,\bar v_R\gamma_\mu u_R\,\frac{1}{q^2}(k_+ - k_-)^\mu, \tag{21.94}$$


which is just the amplitude for an $e^-_R$, with $Y = -1$, to couple to a $\phi^+$,
with $Y = 1/2$, through the $U(1)$ gauge boson $B_\mu$, with coupling constant
$g' = e/\cos\theta_w$. This expression reflects the fact that the $e^-_R$ has no direct
coupling to the $SU(2)$ gauge bosons. Similarly, the amplitude for left-handed
electrons tends to

$$i\mathcal{M}(e^-_Le^+_R \to \phi^+\phi^-) \to ie^2\left[\frac{1}{4\cos^2\theta_w} + \frac{1}{4\sin^2\theta_w}\right]\bar v_L\gamma_\mu u_L\,\frac{1}{q^2}(k_+ - k_-)^\mu \tag{21.95}$$

in the high-energy limit. This has the structure of a coherent sum of amplitudes
with $B_\mu$ and $A^3_\mu$ exchange. In just the way that we saw in Chapter 11,
the symmetry structure of a gauge theory with spontaneously broken symmetry
is recovered in the high-energy limit.
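The high-energy limits (21.94) and (21.95) follow from simple algebra on the photon and $Z$ exchange terms, which can be checked numerically. The sketch below is our own illustration, not the text's calculation. It assumes the $Z$ charges $-\frac12 + \sin^2\theta_w$ for $e^-_L$, $\sin^2\theta_w$ for $e^-_R$, and $\frac12 - \sin^2\theta_w$ for $\phi^+$, with an overall photon term normalized to 1, and the numerical inputs are assumed values.

```python
import math


def left_handed_bracket(s, m_z, sin2):
    """q^2 times the photon-plus-Z factor for an initial left-handed electron."""
    cos2 = 1.0 - sin2
    return 1.0 + (0.5 - sin2) ** 2 / (sin2 * cos2) * s / (s - m_z**2)


def right_handed_bracket(s, m_z, sin2):
    """q^2 times the photon-plus-Z factor for an initial right-handed electron."""
    cos2 = 1.0 - sin2
    return 1.0 - (0.5 - sin2) / cos2 * s / (s - m_z**2)


def left_handed_limit(sin2):
    """Coefficient in (21.95): coherent sum of B and A^3 exchange."""
    return 1.0 / (4.0 * (1.0 - sin2)) + 1.0 / (4.0 * sin2)


def right_handed_limit(sin2):
    """Coefficient in (21.94): pure U(1) exchange."""
    return 1.0 / (2.0 * (1.0 - sin2))


if __name__ == "__main__":
    m_z, sin2 = 91.19, 0.231  # assumed illustrative inputs
    for sqrt_s in (200.0, 1000.0, 1.0e4, 1.0e6):
        s = sqrt_s**2
        print(
            sqrt_s,
            left_handed_bracket(s, m_z, sin2), left_handed_limit(sin2),
            right_handed_bracket(s, m_z, sin2), right_handed_limit(sin2),
        )
```

Both brackets approach their limiting values with corrections of relative order $m_Z^2/s$.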

Now let us compare these results to a direct calculation of the $W^+W^-$
production amplitude in the weak interaction gauge theory. Begin with the
case of an initial $e^-_R$. Since the coupling of the electron to the $W^-$ is purely
left-handed, the third diagram of Fig. 21.7 vanishes in this case, so the computation
is a bit easier. The first two diagrams of Fig. 21.7 have exactly the
same structure and sum to

$$\begin{aligned}
i\mathcal{M}(e^-_Re^+_L \to W^+W^-) = {}& \bar v_R\gamma_\lambda u_R\left[(-ie)\frac{-i}{q^2}(ie) + \left(\frac{ie\sin\theta_w}{\cos\theta_w}\right)\frac{-i}{q^2 - m_Z^2}\left(\frac{ie\cos\theta_w}{\sin\theta_w}\right)\right] \\
&\cdot\left[g^{\mu\nu}(k_- - k_+)^\lambda + g^{\nu\lambda}(-q - k_-)^\mu + g^{\lambda\mu}(k_+ + q)^\nu\right]\epsilon^*_\mu(k_+)\epsilon^*_\nu(k_-).
\end{aligned} \tag{21.96}$$

This equation is valid in any of the $R_\xi$ gauges, since, if we ignore the electron
mass,

$$q^\lambda\,\bar v_R\gamma_\lambda u_R = 0. \tag{21.97}$$

The second line of Eq. (21.96) contains the enhancement for longitudinal
$W$ bosons mentioned above. If we approximate the longitudinal polarization
vectors by (21.60) and drop terms that do not grow as $s \to \infty$, this line
becomes

$$\begin{aligned}
&\left[g^{\mu\nu}(k_- - k_+)^\lambda + g^{\nu\lambda}(-q - k_-)^\mu + g^{\lambda\mu}(k_+ + q)^\nu\right]\frac{k_{+\mu}}{m_W}\frac{k_{-\nu}}{m_W} \\
&\quad = \frac{1}{m_W^2}\left[k_+\cdot k_-\,(k_- - k_+)^\lambda - 2k_-\cdot k_+\,k_-^\lambda + 2k_+\cdot k_-\,k_+^\lambda\right] + \mathcal{O}(1)\cdot(k_- - k_+)^\lambda \\
&\quad = \frac{s}{2m_W^2}(k_+ - k_-)^\lambda + \cdots.
\end{aligned} \tag{21.98}$$

On the other hand, the expression in brackets in the first line of (21.96) cancels
almost completely, to

$$-ie^2\left(\frac{1}{q^2} - \frac{1}{q^2 - m_Z^2}\right) = ie^2\,\frac{m_Z^2}{q^2(q^2 - m_Z^2)}.$$

Using both of these simplifications, we find

$$i\mathcal{M}(e^-_Re^+_L \to W^+_LW^-_L) = \bar v_R\gamma_\lambda u_R\left[(ie^2)\frac{m_Z^2}{s^2}\right]\frac{s}{2m_W^2}(k_+ - k_-)^\lambda. \tag{21.99}$$

By inserting the relation $m_W = m_Z\cos\theta_w$, we see that this amplitude is
identical to (21.94), as required by the equivalence theorem.
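The contraction of the three-boson vertex with longitudinal polarization vectors can also be done exactly in the center-of-mass frame, without the approximation (21.60). The sketch below is our own check, with assumed function names and an assumed $W$ mass. It contracts the tensor structure in the second line of (21.96) with the exact vectors (21.59) and compares the result with the leading term $\frac{s}{2m_W^2}(k_+ - k_-)^\lambda$.

```python
import math


def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def scale(c, a):
    return tuple(c * x for x in a)


def contracted_vertex(sqrt_s, m_w):
    """Contract the vertex tensor with exact longitudinal polarizations."""
    energy = 0.5 * sqrt_s
    p = math.sqrt(energy**2 - m_w**2)
    k_plus = (energy, 0.0, 0.0, p)
    k_minus = (energy, 0.0, 0.0, -p)
    q = add(k_plus, k_minus)
    eps_plus = (p / m_w, 0.0, 0.0, energy / m_w)
    eps_minus = (p / m_w, 0.0, 0.0, -energy / m_w)
    # g^{mu nu}(k- - k+)^lambda term
    term1 = scale(dot(eps_plus, eps_minus), add(k_minus, scale(-1.0, k_plus)))
    # g^{nu lambda}(-q - k-)^mu term
    term2 = scale(-dot(add(q, k_minus), eps_plus), eps_minus)
    # g^{lambda mu}(k+ + q)^nu term
    term3 = scale(dot(add(k_plus, q), eps_minus), eps_plus)
    total = add(add(term1, term2), term3)
    difference = add(k_plus, scale(-1.0, k_minus))
    return total, difference


def ratio_to_leading_term(sqrt_s, m_w):
    """Exact 3-component divided by (s / 2 m_W^2)(k+ - k-)^3."""
    total, difference = contracted_vertex(sqrt_s, m_w)
    s = sqrt_s**2
    return total[3] / (s / (2.0 * m_w**2) * difference[3])


if __name__ == "__main__":
    m_w = 80.4  # assumed illustrative W mass in GeV
    for sqrt_s in (200.0, 1000.0, 1.0e4):
        print(sqrt_s, ratio_to_leading_term(sqrt_s, m_w))
```

In this frame the ratio comes out as $1 + 2m_W^2/s$, so the leading term of (21.98) is recovered up to corrections suppressed by $m_W^2/s$.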

For the amplitude with an initial $e^-_L$, the computation is somewhat more
involved. Now all three diagrams of Fig. 21.7 contribute, and since the last
diagram has a different kinematic structure, it will be less clear how the diagrams
combine together. In what follows, we will demonstrate the cancellation
of the unitarity-violating enhanced terms, and we will indicate how the terms
one order smaller in $m_W^2/s$ assemble into the correct structure. However, we
will not account rigorously for all of these smaller terms. The full calculation
of these diagrams is the subject of Problem 21.2.

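For the case of an initial $e^-_L$, the first two diagrams of Fig. 21.7 sum to
the expression

$$\begin{aligned}
&\bar v_L\gamma_\lambda u_L\left[(-ie)\frac{-i}{q^2}(ie) + \left(\frac{ie(-\frac12 + \sin^2\theta_w)}{\sin\theta_w\cos\theta_w}\right)\frac{-i}{q^2 - m_Z^2}\left(\frac{ie\cos\theta_w}{\sin\theta_w}\right)\right] \\
&\quad\cdot\left[g^{\mu\nu}(k_- - k_+)^\lambda + g^{\nu\lambda}(-q - k_-)^\mu + g^{\lambda\mu}(k_+ + q)^\nu\right]\epsilon^*_\mu(k_+)\epsilon^*_\nu(k_-),
\end{aligned} \tag{21.100}$$

which differs from (21.96) only in the coupling of the electron to the virtual
$Z^0$. For longitudinal $W$ bosons, we can simplify this expression as we did
(21.96), obtaining

$$\bar v_L\gamma_\lambda u_L\,(ie^2)\left[\frac{m_Z^2}{s(s - m_Z^2)} - \frac{1}{2\sin^2\theta_w}\frac{1}{s - m_Z^2}\right]\frac{s}{2m_W^2}(k_+ - k_-)^\lambda. \tag{21.101}$$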

The second term in brackets is a potentially dangerous contribution, which
must be canceled by the diagram with $t$-channel neutrino exchange. This
diagram has the value

$$\left(\frac{ig}{\sqrt2}\right)^2\bar v_L\,\gamma^\mu\,\frac{i(\not\ell - \not k_-)}{(\ell - k_-)^2}\,\gamma^\nu u_L(\ell)\,\epsilon^*_\mu(k_+)\epsilon^*_\nu(k_-), \tag{21.102}$$

where $\ell$ is the initial electron momentum. Approximating the longitudinal
polarization vectors as before, we have

$$\left(\frac{ig}{\sqrt2}\right)^2\bar v_L\,\frac{\not k_+}{m_W}\,\frac{i(\not\ell - \not k_-)}{(\ell - k_-)^2}\,\frac{\not k_-}{m_W}\,u_L(\ell). \tag{21.103}$$

Now we manipulate this expression as if we were proving a Ward identity.
Using the fact that $u_L(\ell)$ satisfies the Dirac equation,

$$(\not\ell - \not k_-)\not k_-\,u_L(\ell) = -(\not\ell - \not k_-)^2\,u_L(\ell) = -(\ell - k_-)^2\,u_L(\ell), \tag{21.104}$$

expression (21.103) reduces to

$$\frac{ig^2}{2m_W^2}\,\bar v_L\not k_+\,u_L(\ell). \tag{21.105}$$

Finally, using Eq. (21.97), we can rewrite this expression as

$$ie^2\,\frac{1}{2\sin^2\theta_w}\,\bar v_L\gamma_\lambda u_L\,\frac{1}{2m_W^2}(k_+ - k_-)^\lambda. \tag{21.106}$$

This term cancels the dangerous high-energy behavior of (21.101). To see
that the sum of diagrams has the correct high-energy limit, however, the
approximations that we have used are not quite adequate. In particular, the
correction to relation (21.60) for the polarization vectors is of order $m_W^2/s$
and must be taken into account. When all of the corrections of order $m_W^2/s$
are included, it turns out that the sum of the $s$-channel diagrams (21.101) is
unchanged, while the expression for the neutrino exchange diagram (21.106)
is multiplied by the factor $(1 + 2m_W^2/s)$. Then the sum of all three diagrams
gives

$$i\mathcal{M}(e^-_Le^+_R \to W^+_LW^-_L) = ie^2\,\bar v_L\gamma_\lambda u_L\,\frac{1}{s}(k_+ - k_-)^\lambda\left[\frac{1}{2\cos^2\theta_w} - \frac{1}{4\cos^2\theta_w\sin^2\theta_w} + \frac{1}{2\sin^2\theta_w}\right]. \tag{21.107}$$

The middle term in brackets cancels half of each of the other two terms, to
give an expression that agrees precisely with Eq. (21.95).
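The cancellation described here can be followed numerically. In the sketch below, which is our own illustration with assumed names and inputs, the coefficient of $ie^2\,\bar v_L\gamma_\lambda u_L(k_+ - k_-)^\lambda$ is evaluated separately for the $s$-channel sum (21.101) and for the neutrino exchange term (21.106) multiplied by the quoted factor $(1 + 2m_W^2/s)$. Each piece alone grows with $s$ when multiplied by $s$, while their sum tends to the Goldstone boson coefficient of (21.95).

```python
import math


def s_channel_coefficient(s, m_w, m_z, sin2):
    """Coefficient from the photon and Z diagrams, Eq. (21.101)."""
    bracket = m_z**2 / (s * (s - m_z**2)) - 1.0 / (2.0 * sin2 * (s - m_z**2))
    return bracket * s / (2.0 * m_w**2)


def neutrino_coefficient(s, m_w, sin2):
    """Coefficient from neutrino exchange, Eq. (21.106) times (1 + 2 m_W^2 / s)."""
    return 1.0 / (2.0 * sin2) / (2.0 * m_w**2) * (1.0 + 2.0 * m_w**2 / s)


def goldstone_coefficient(sin2):
    """Bracket of (21.95); the full amplitude carries an extra factor 1/s."""
    return 1.0 / (4.0 * (1.0 - sin2)) + 1.0 / (4.0 * sin2)


def summed_coefficient_times_s(s, m_w, m_z, sin2):
    """s times the sum of all three diagrams, to compare with (21.95)."""
    total = s_channel_coefficient(s, m_w, m_z, sin2) + neutrino_coefficient(s, m_w, sin2)
    return s * total


if __name__ == "__main__":
    m_z, sin2 = 91.19, 0.231          # assumed illustrative inputs
    m_w = m_z * math.sqrt(1.0 - sin2)  # tree-level relation m_W = m_Z cos(theta_w)
    for sqrt_s in (500.0, 1000.0, 1.0e4, 1.0e5):
        s = sqrt_s**2
        print(
            sqrt_s,
            s * s_channel_coefficient(s, m_w, m_z, sin2),
            s * neutrino_coefficient(s, m_w, sin2),
            summed_coefficient_times_s(s, m_w, m_z, sin2),
            goldstone_coefficient(sin2),
        )
```

The individual columns grow like $s/m_W^2$ with opposite signs, and only their sum stays bounded.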
Figure 21.10. The differential cross section for $e^-_Le^+_R \to W^+W^-$, in units
of R (Eq. (5.15)), at $E_{\rm cm} = 1000$ GeV. The various curves show the contributions
to the total from individual helicity states of $W^-$ and $W^+$; these are
denoted $(h_-, h_+)$, where each helicity takes the values $(+, -, 0)$. The contributions
from the $(+,+)$ and $(-,-)$ states are too small to be visible. Notice
that both the longitudinal cross section, denoted $(0,0)$, and the $(+,-)$ cross
section become proportional to $\sin^2\theta$ at very high energy.


The calculation of Problem 21.2 gives for the complete annihilation amplitude


iM(e L eX — >W^W L ) = ie 2 t:i,'\ni,U' ■ — k-) x - 


1 ( s / m| \ 2 

2 sin 2 9 W \ s - m| V 2 m\ v + / fP 

_ 8m 2 w \ m% ( \s+m 2 w \ 

sfP(l+fi 2 —2f]cos6)j s — m 2 z ) 


(21.108) 


where $\beta = (1 - 4m_W^2/s)^{1/2}$ is the $W$ boson velocity. The high-energy limit
of this expression indeed reproduces (21.107). The contributions to the differential
cross section for $e^-_Le^+_R \to W^+W^-$ from this and the other possible
helicity states are plotted in Fig. 21.10.

These cancellations among the diagrams of Fig. 21.7 occur by virtue of
the Ward identities of the gauge theory. That is, they occur only because
the theory has an underlying local gauge invariance. At the beginning of our
discussion, we argued that these cancellations are necessary to ensure that
the theory remains, in a consistent way, weakly coupled up to arbitrarily
high energy. In Section 20.1, we showed that one can generate masses for
vector bosons by spontaneously breaking local gauge invariance. We have now
argued the converse of that result: that the only theories of massive vector
bosons that do not have violent high-energy behavior are those that result
from spontaneously broken gauge theories.†

21.3 One-Loop Corrections to the Weak-Interaction 
Gauge Theory 

The final topic in our study of spontaneously broken gauge theories is the 
computation of one-loop corrections in the weak-interaction gauge theory. 
As we discussed in Section 20.2, tree-level diagrams produce a number of 
intricate predictions for the couplings of the Z° and the cross sections for 
neutral current reactions. In general, these predictions are modified by the 
effects of one-loop diagrams. In this section we will study some examples of 
these one-loop corrections. 

As in any renormalizable field theory, the one-loop diagrams of the electroweak
gauge theory are typically ultraviolet divergent. These divergences
can be absorbed by adjusting the underlying parameters of the theory. These
adjustments define a set of counterterms which, by renormalizability, render
the full set of one-loop diagrams of the theory finite. Those amplitudes that
are not adjusted by hand then become predictions of the theory.

In Chapter 11, we saw that this general procedure, which applies to any
renormalizable field theory, gives especially rich information when applied to a
theory with spontaneous symmetry breaking. In a theory with spontaneously
broken symmetry, the amplitudes of the theory vary markedly for different
particles in the same multiplet of the original symmetry. However, the counterterms
of the theory respect the symmetry relations. Thus, the adjustment
of an amplitude for one particle leads to definite predictions for other particles
that are not related by any manifest symmetry.

Theoretical Orientation, and a Specific Problem 

At the end of Section 11.6, we presented a useful framework for organizing
calculations of the predictions of renormalizable theories with spontaneous
symmetry breaking. We defined a zeroth-order natural relation to be a relation
among observable quantities in the theory that is true for any values of
the parameters in the Lagrangian. Since the counterterms of the theory shift
the values of the underlying parameters without adding new terms, a
zeroth-order natural relation will not be corrected by these counterterms. Thus, if the
theory is renormalizable, the one-loop corrections to a zeroth-order natural


†This statement is proved systematically in the paper of Cornwall, Levin, and
Tiktopoulos cited at the beginning of this section.





relation will be finite, and will in fact be definite predictions from the quantum
structure of the field theory. Though we discussed this idea originally in
theories with spontaneously broken global symmetry, it applies equally well
to theories with spontaneously broken gauge symmetry. In this section, we
will apply this idea to derive finite one-loop corrections to relations in the
weak-interaction gauge theory.

It is easy to find zeroth-order natural relations in the electroweak theory. 
The leading-order predictions given in Section 20.2 involve a relatively small 
number of free parameters. Many of these predictions are made for energies at 
which the quark and lepton masses can be ignored; then they depend only on 
the coupling constants g and g' and the vacuum expectation value v , which 
sets the scale of spontaneous symmetry breaking. The remaining ingredients 
of the weak-interaction theory are given in terms of these parameters; for 
example, 


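$$ m_W = g\,\frac{v}{2}, \qquad m_Z = \sqrt{g^2 + g'^2}\,\frac{v}{2}, \qquad
e = \frac{g g'}{\sqrt{g^2 + g'^2}}, \qquad
\frac{G_F}{\sqrt{2}} = \frac{g^2}{8 m_W^2} = \frac{1}{2v^2}. \tag{21.109} $$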


Even in this set of quantities, we have four relations that depend on three 
underlying parameters, so there is one relation of observable quantities that 
is independent of the parameters of the Lagrangian. 
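The counting argument can be illustrated numerically. The Python sketch below evaluates the four quantities of Eq. (21.109) for a few arbitrary values of $g$, $g'$, and $v$ (the sample values and all function names are illustrative assumptions, not fits to data) and checks one way of writing the parameter-independent relation: the value of $\sin^2 2\theta_w$ built from $\alpha$, $G_F$, $m_Z$ agrees with the value built from $m_W/m_Z$.

```python
import math

def tree_level_observables(g, g_prime, v):
    """Tree-level m_W, m_Z, e, G_F of Eq. (21.109) from the bare parameters."""
    norm = math.sqrt(g**2 + g_prime**2)
    m_w = g * v / 2.0
    m_z = norm * v / 2.0
    e = g * g_prime / norm
    g_fermi = math.sqrt(2.0) / (2.0 * v**2)  # from G_F / sqrt(2) = 1 / (2 v^2)
    return m_w, m_z, e, g_fermi

def parameter_free_residual(g, g_prime, v):
    """Difference of two expressions for sin^2(2 theta_w) built from observables."""
    m_w, m_z, e, g_fermi = tree_level_observables(g, g_prime, v)
    # 4 pi alpha = e^2, so this is 4 pi alpha / (sqrt(2) G_F m_Z^2)
    from_couplings = e**2 / (math.sqrt(2.0) * g_fermi * m_z**2)
    s2_from_masses = 1.0 - m_w**2 / m_z**2
    from_masses = 4.0 * s2_from_masses * (1.0 - s2_from_masses)
    return from_couplings - from_masses

# Arbitrary, purely illustrative parameter choices (g, g', v).
SAMPLES = [(0.65, 0.35, 246.0), (0.30, 0.90, 100.0), (1.20, 0.10, 5.0)]
RESIDUALS = [parameter_free_residual(*sample) for sample in SAMPLES]

if __name__ == "__main__":
    for sample, residual in zip(SAMPLES, RESIDUALS):
        print(sample, residual)
```

The residual vanishes (up to rounding) for every choice of the three parameters, which is what it means for the relation to be natural at zeroth order.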

Since many of the predictions of the weak-interaction gauge theory are
determined by the parameter $\sin^2\theta_w$, it is useful to define $\sin^2\theta_w$ in terms
of observables and then use this definition as a basis for constructing natural
relations. In our discussion of the precision tests of electroweak theory in
Section 20.2, we used the definition

$$ s_W^2 = 1 - \frac{m_W^2}{m_Z^2} \tag{21.110} $$

as a standard for comparison of different experiments. But since the three
most accurately known weak-interaction observables are $\alpha$, $G_F$, and $m_Z$, it is
useful to construct another physical definition of $\sin^2\theta_w$ based on these three
quantities. Define $\theta_0$ such that

$$ \sin 2\theta_0 = \left( \frac{4\pi\alpha_*}{\sqrt{2}\, G_F m_Z^2} \right)^{1/2}, \tag{21.111} $$

where $\alpha_*$ is the running coupling constant of QED evaluated at the scale
$Q^2 = m_Z^2$. The renormalization group insists that it is the value of the electric charge
at the weak-interaction scale that enters precision electroweak predictions,
and this observation is confirmed by summing radiative correction diagrams
involving light quarks and leptons. The current best values of the quantities
in Eq. (21.111) give


$$ s_0^2 = \sin^2\theta_0 = 0.2307 \pm 0.0005. \tag{21.112} $$

Thus, this quantity provides a very accurate standard of reference.

Once Eq. (21.111) is taken to define a reference value of $\sin^2\theta_w$, the
equations of Section 20.2 that connect $\sin^2\theta_w$ to other observables become
zeroth-order natural relations. For example, the tree-level equations

$$ \frac{m_W^2}{m_Z^2} = 1 - \sin^2\theta_w, \qquad
A^e_{LR} = \frac{(\tfrac12 - \sin^2\theta_w)^2 - (\sin^2\theta_w)^2}
               {(\tfrac12 - \sin^2\theta_w)^2 + (\sin^2\theta_w)^2} \tag{21.113} $$

are natural relations linking four observables of the weak interactions. The
corrections to these relations will be well-defined predictions of the theory.
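As a rough numerical illustration of Eqs. (21.111) and (21.113), the Python sketch below evaluates $\sin^2\theta_0$ and the tree-level asymmetry. The input values of $\alpha_*$, $G_F$, and $m_Z$ are assumed here for illustration and are not quoted in the text, so the result only approximately reproduces the reference value (21.112); the function names are likewise hypothetical.

```python
import math

# Assumed illustrative inputs; these numbers are not quoted in the text.
ALPHA_STAR = 1.0 / 128.9   # QED coupling at Q^2 = m_Z^2
G_FERMI = 1.16639e-5       # Fermi constant in GeV^-2
M_Z = 91.187               # Z mass in GeV

def sin2_theta0(alpha_star, g_fermi, m_z):
    """sin^2(theta_0) from Eq. (21.111), taking the root with theta_0 < pi/4."""
    sin_sq_2theta = 4.0 * math.pi * alpha_star / (math.sqrt(2.0) * g_fermi * m_z**2)
    return 0.5 * (1.0 - math.sqrt(1.0 - sin_sq_2theta))

def a_lr_tree(s2):
    """Tree-level left-right asymmetry of Eq. (21.113) for sin^2(theta_w) = s2."""
    left = 0.5 - s2    # magnitude of the left-handed electron coupling
    right = s2         # magnitude of the right-handed electron coupling
    return (left**2 - right**2) / (left**2 + right**2)

S0_SQ = sin2_theta0(ALPHA_STAR, G_FERMI, M_Z)

S2_REF = 0.2307  # central value of Eq. (21.112)
A_LR_REF = a_lr_tree(S2_REF)

# Central finite difference: sensitivity of A_LR to sin^2(theta_w).
STEP = 1.0e-6
SLOPE = (a_lr_tree(S2_REF + STEP) - a_lr_tree(S2_REF - STEP)) / (2.0 * STEP)

if __name__ == "__main__":
    print(f"sin^2(theta_0) = {S0_SQ:.4f}")
    print(f"A_LR(tree)     = {A_LR_REF:.4f}")
    print(f"dA_LR/ds^2     = {SLOPE:.2f}")
```

The steep slope of $A^e_{LR}$ as a function of $\sin^2\theta_w$ is what makes the asymmetry a sensitive probe: a shift of order $10^{-3}$ in $\sin^2\theta_w$ moves the asymmetry by several times that amount.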

In principle, we could now compute all of the one-loop diagrams that
correct the parameters $m_W$, $m_Z$, $G_F$, $\alpha$, and $A^e_{LR}$. However, this is a very
complicated exercise, requiring an extensive technical apparatus.* In this
section we will focus on radiative corrections from one simple source that can
be considered independently. Aside from the question of anomalies, the
electroweak theory does not restrict the number of quark or lepton generations.
Thus, it is sensible, and gauge invariant, to compute the one-loop corrections
due to one quark or lepton doublet. For definiteness, we consider the effects
of the $(t, b)$ quark doublet.

By focusing on the radiative corrections due to heavy quarks, we dramatically
simplify the calculational task before us. The various observables of the
weak-interaction gauge theory are extracted from the measurement of scattering
amplitudes with light fermions, leptons or quarks, in the initial and final
states. For example, $G_F$ is measured from the strength of a low-energy
weak-interaction process, usually chosen to be the rate of muon decay:
$\mu^- \to \nu_\mu e^- \bar\nu_e$.
For any such process, there are one-loop corrections of many kinds, as shown
in Fig. 21.11. In addition to corrections to the vector boson propagator, there
are vertex corrections, box diagrams, and diagrams with real photon emission.
In general, the contributions of the various classes of diagrams are not
gauge invariant; rather, gauge invariance results from cancellations between
the classes of diagrams in Fig. 21.11(b), (c), and (d). However, since heavy
quarks do not couple directly to the light leptons, the $(t, b)$ doublet contributes
only the single diagram shown in Fig. 21.11(f), which must be gauge invariant
by itself. This same conclusion applies to the $(t, b)$ correction to other leptonic
weak-interaction processes. If we ignore the CKM angles that mix the $t$ and $b$
with other species, the conclusion extends also to weak-interaction processes
involving light quarks.

A similar situation occurs with other species of particles, such as those
of the Higgs sector. The coupling of Higgs sector particles to a light quark
or lepton is proportional to the fermion's mass, which we can often ignore.
Thus the most important contributions from Higgs-sector particles are
propagator corrections. The case in which the spontaneous symmetry breaking is
produced by a single scalar field $\phi$ is particularly straightforward to analyze;
this is done in Problem 21.4. Loop corrections from particles that do not couple
directly to the external fermions are often termed oblique, since they enter
the low-energy weak interactions only indirectly.

*A detailed theoretical discussion of one-loop corrections to the electroweak theory
can be found in W. Hollik, Fortschr. d. Physik 38, 165 (1990).

Figure 21.11. Examples of radiative corrections to $\mu$ decay in the
weak-interaction gauge theory: (a) lowest-order diagram; (b) propagator corrections;
(c) vertex diagrams; (d) box diagrams; (e) real photon corrections; (f) the
contribution of the $(t, b)$ doublet.

Influence of Heavy Quark Corrections 

Our task, then, is to compute the corrections to relations (21.113) due to
the $(t, b)$ doublet. These two relations depend on five observable quantities,
$m_Z$, $m_W$, $A^e_{LR}$, $\alpha$, and $G_F$, with the last two parameters entering through
$\theta_w$ and Eq. (21.111). We will express these five quantities as functions of the
bare parameters $g$, $g'$, and $v$, with corrections proportional to combinations of
$t$ and $b$ vacuum polarization diagrams. The zeroth-order terms will naturally
cancel out when we compute the corrections to the relations (21.113).

The loop amplitudes that we require are shown in Fig. 21.12. To deal
with these contributions most straightforwardly, we introduce a uniform
notation for vacuum polarization amplitudes. Denote the vacuum polarization
amplitude involving the gauge bosons $I$ and $J$ as

$$ \big(\text{vacuum polarization diagram with external gauge bosons } I \text{ and } J\big)
   = i\Pi^{\mu\nu}_{IJ}(q), \tag{21.114} $$

where $I$ and $J$ may be $\gamma$, $W$, or $Z$. When the gauge bosons are massive,
the vacuum polarization amplitudes need not be transverse by themselves, so
$\Pi^{\mu\nu}_{IJ}(q)$ need not vanish at $q^2 = 0$. Thus, we will change our notation from the
case of QED and write the decomposition of $\Pi^{\mu\nu}_{IJ}(q)$ into tensor structures as

$$ \Pi^{\mu\nu}_{IJ}(q) = \Pi_{IJ}(q^2)\, g^{\mu\nu} - \Delta(q^2)\, q^\mu q^\nu. \tag{21.115} $$

Figure 21.12. One-loop corrections from $t$ and $b$ to weak-interaction
observables: (a) $m_Z$; (b) $m_W$; (c) $\alpha$; (d) $G_F$; (e) $A^e_{LR}$.

In all of the examples to follow, the factors $q^\mu$ will dot into currents of light
leptons, to give zero as in Eq. (21.97). Thus the form factor $\Delta(q^2)$ will drop
out of our calculations. Our previous result that $\Pi^{\mu\nu}(q)$ vanishes in QED at
$q^2 = 0$ appears in this formalism as the set of constraints

$$ \Pi_{\gamma\gamma}(0) = \Pi_{\gamma Z}(0) = 0. \tag{21.116} $$

For the other amplitudes, our sign conventions are chosen so that a positive
value of $\Pi_{II}(m^2)$ gives a positive mass shift to the gauge boson. Let us also
define

$$ \Pi'_{\gamma\gamma}(0) = \frac{d\Pi_{\gamma\gamma}}{dq^2}\bigg|_{q^2 = 0}; \tag{21.117} $$

this is the quantity we called $\Pi(0)$ in Eq. (7.73).

Now we use this notation to write the loop corrections to each of the five
observables. The first two diagrams in Fig. 21.12 are simply mass corrections,
and so, straightforwardly,

$$ m_Z^2 = (g^2 + g'^2)\,\frac{v^2}{4} + \Pi_{ZZ}(m_Z^2), \qquad
   m_W^2 = g^2\,\frac{v^2}{4} + \Pi_{WW}(m_W^2). \tag{21.118} $$

Note that both vacuum polarization amplitudes are evaluated at the poles in
the respective propagators. To evaluate the shift of $\alpha$ by one-loop corrections,
we consider the effect of Fig. 21.12(c) on the low-energy Coulomb potential.
The values of the leading-order propagator and the one-loop correction
combine to give the factors

$$ \frac{-ie^2}{q^2}\left( 1 + i\Pi_{\gamma\gamma}(q^2)\cdot\frac{-i}{q^2} \right), \tag{21.119} $$

where, in this equation, $e^2$ is given in terms of bare variables as in (21.109).
Thus, the observed value of $\alpha$, in the limit $q^2 \to 0$, is modified according to
the relation

$$ 4\pi\alpha = \frac{g^2 g'^2}{g^2 + g'^2}\,\big( 1 + \Pi'_{\gamma\gamma}(0) \big). \tag{21.120} $$


In a similar way, the diagrams of Fig. 21.12(d) give a modified strength of
the 4-fermion weak-interaction process that leads to $\mu$ decay. The leading and
one-loop diagrams sum to

$$ \frac{g^2}{2}\,\frac{-i}{q^2 - m_W^2}
   \left( 1 + i\Pi_{WW}(q^2)\cdot\frac{-i}{q^2 - m_W^2} \right). \tag{21.121} $$

Then the effective strength of the weak-interaction vertex at $q^2 = 0$ is shifted
as follows:

$$ \frac{G_F}{\sqrt{2}} = \frac{1}{2v^2}\left( 1 - \frac{\Pi_{WW}(0)}{m_W^2} \right). \tag{21.122} $$

Notice that, in the approximation of keeping only oblique corrections, the
strength of every low-energy weak-interaction amplitude is corrected by this
same factor.

Finally, the polarization asymmetry $A^e_{LR}$ is corrected by a $(t, b)$ loop
diagram according to Fig. 21.12(e). The analogous diagram with an intermediate
$Z^0$ is summed into the $Z^0$ propagator and does not affect the form of the
vertex. At zeroth order, the coupling of the $Z^0$ to any left- or right-handed light
fermion is given, according to Eq. (20.71), by

$$ \sqrt{g^2 + g'^2}\left( T^3 - \frac{g'^2}{g^2 + g'^2}\, Q \right). \tag{21.123} $$

The coefficient of $Q$ is the bare value of $\sin^2\theta_w$. The loop diagram in
Fig. 21.12(e) adds to this a contribution

$$ i\Pi_{\gamma Z}(q^2)\,\frac{-i}{q^2}\cdot(ieQ). \tag{21.124} $$

To discuss asymmetries at the $Z^0$ resonance, we set $q^2 = m_Z^2$. The term
(21.124) adds to the piece of (21.123) proportional to $Q$; thus it shifts the
bare value of $\sin^2\theta_w$. When we include this correction, the $Z^0$ coupling takes
the form

$$ \sqrt{g^2 + g'^2}\,\big( T^3 - s_*^2\, Q \big), \tag{21.125} $$

where

$$ s_*^2 = \frac{g'^2}{g^2 + g'^2} - \frac{e}{\sqrt{g^2 + g'^2}}\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}. \tag{21.126} $$

The asymmetries at the $Z^0$ resonance discussed in Section 20.2 are computed
as ratios of these couplings. Thus, to include the oblique radiative correction
to $A^f_{LR}$, for any light fermion species $f$, we reevaluate formula (20.96), using
$s_*^2$ in place of the zeroth-order $\sin^2\theta_w$.

We might, in fact, say that $s_*^2$ gives an additional way to define $\sin^2\theta_w$
from observable quantities, to be compared to the definitions $s_W^2$ given in
(21.110) and $s_0^2$ given in (21.111). Speaking strictly, the value of $\sin^2\theta_w$
determined by the asymmetries at the $Z^0$ depends on the quark or lepton quantum
numbers through vertex corrections that are not included in the analysis
above. However, these species-dependent corrections are small and can be
systematically subtracted to define a universal $s_*^2$ that determines the
weak-interaction asymmetries of all fermion species.*

The three definitions of $\sin^2\theta_w$ all agree at zeroth order but receive
different radiative corrections. If we include only the oblique corrections, it is easy
to produce compact formulae for the three quantities. From (21.126), we have

$$ s_*^2 = \frac{g'^2}{g^2 + g'^2} - \sin\theta_w\cos\theta_w\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}. \tag{21.127} $$

In the prefactor of the one-loop correction, we can ignore the distinction
between the bare and renormalized values of $\sin^2\theta_w$. We can obtain a similar
expression for $s_W^2$ by taking the ratio of the two formulae in (21.118):

$$ s_W^2 = \frac{g'^2}{g^2 + g'^2}
   - \frac{1}{m_Z^2}\left( \Pi_{WW}(m_W^2) - \frac{m_W^2}{m_Z^2}\,\Pi_{ZZ}(m_Z^2) \right). \tag{21.128} $$

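Finally, we can evaluate the oblique corrections to $\sin^2\theta_0$ defined by (21.111).
This is most readily done by writing $\delta\theta_0$ for the difference between the true
and the bare value of $\theta_0$, and then expanding (21.111) as follows:

$$ 2\cos 2\theta_0\,\delta\theta_0 = \frac12 \sin 2\theta_0
   \left[ \frac{\delta\alpha}{\alpha} - \frac{\delta G_F}{G_F} - \frac{\delta m_Z^2}{m_Z^2} \right]. \tag{21.129} $$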


The shifts of $\alpha$, $G_F$, and $m_Z^2$ can be read from (21.120), (21.122), and (21.118).
Then we can reconstruct

$$ \sin^2\theta_0 = \frac{g'^2}{g^2 + g'^2} + 2\sin\theta_0\cos\theta_0\,\delta\theta_0. \tag{21.130} $$

Assembling the pieces and evaluating the coefficients of the vacuum
polarization diagrams to zeroth order, we obtain

$$ \sin^2\theta_0 = \frac{g'^2}{g^2 + g'^2}
   + \frac{\sin^2\theta_w\cos^2\theta_w}{\cos^2\theta_w - \sin^2\theta_w}
   \left[ \Pi'_{\gamma\gamma}(0) + \frac{\Pi_{WW}(0)}{m_W^2} - \frac{\Pi_{ZZ}(m_Z^2)}{m_Z^2} \right]. \tag{21.131} $$
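The linearization behind Eqs. (21.129) to (21.131) can be checked numerically. The Python sketch below compares the exact change of $\sin^2\theta_0$ under small fractional shifts of $\alpha$, $G_F$, and $m_Z^2$ with the first-order coefficient $\sin^2\theta\cos^2\theta/(\cos^2\theta - \sin^2\theta)$ that appears in (21.131). The central values and the shifts are hypothetical numbers chosen only for illustration.

```python
import math

def sin2_theta0(alpha, g_fermi, m_z_sq):
    """sin^2(theta_0) from Eq. (21.111), taking the root with theta_0 < pi/4."""
    sin_sq_2theta = 4.0 * math.pi * alpha / (math.sqrt(2.0) * g_fermi * m_z_sq)
    return 0.5 * (1.0 - math.sqrt(1.0 - sin_sq_2theta))

# Assumed central values and hypothetical fractional shifts (not from the text).
ALPHA, G_FERMI, M_Z_SQ = 1.0 / 128.9, 1.16639e-5, 91.187**2
D_ALPHA, D_GF, D_MZ_SQ = 1.0e-4, -2.0e-4, -3.0e-4

S2 = sin2_theta0(ALPHA, G_FERMI, M_Z_SQ)
C2 = 1.0 - S2

# Exact change of sin^2(theta_0) when all three inputs are shifted.
EXACT_SHIFT = sin2_theta0(ALPHA * (1.0 + D_ALPHA),
                          G_FERMI * (1.0 + D_GF),
                          M_Z_SQ * (1.0 + D_MZ_SQ)) - S2

# First-order prediction: 2 sin(theta) cos(theta) delta(theta) from (21.129)-(21.130).
LINEAR_SHIFT = S2 * C2 / (C2 - S2) * (D_ALPHA - D_GF - D_MZ_SQ)

if __name__ == "__main__":
    print(EXACT_SHIFT, LINEAR_SHIFT)
```

The two numbers agree up to terms of second order in the shifts, which is the accuracy to which a one-loop formula such as (21.131) is meant to hold.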

It is not difficult to discover that each of the equations (21.127), (21.128),
and (21.131) contains ultraviolet divergences. However, if the weak-interaction
gauge theory is renormalizable, these divergences should cancel when we
compute the corrections to any zeroth-order natural relation. In the situation
that we consider, renormalizability implies that the various definitions of
$\sin^2\theta_w$ should differ only by expressions that are ultraviolet-finite.

*This is explained clearly in D. Kennedy and B. W. Lynn, Nucl. Phys. B322, 1
(1989).

We are now almost prepared to check this prediction explicitly. We can
clarify the structure of the ultraviolet divergences in our relations for the
various quantities $\sin^2\theta_w$ by recasting the vacuum polarization amplitudes to
make more explicit the quantum numbers to which the gauge bosons couple.
Recall from Eq. (20.71) that the $Z$ boson couples to the combination of
$SU(2)$ and electromagnetic quantum numbers $(T^3 - \sin^2\theta_w Q)$. Similarly, the
$W$ bosons couple to $T^\pm$, or, equivalently, to $T^1$, $T^2$. It is useful to break up
the vacuum polarization amplitudes into terms that depend on these specific
quantum numbers. We will also extract the coupling constants indicated in
(20.71). Thus we replace

$$ \begin{aligned}
\Pi_{\gamma\gamma} &= e^2\,\Pi_{QQ}, \\
\Pi_{\gamma Z} &= \frac{e^2}{\sin\theta_w\cos\theta_w}\left[ \Pi_{3Q} - \sin^2\theta_w\,\Pi_{QQ} \right], \\
\Pi_{ZZ} &= \left( \frac{e}{\sin\theta_w\cos\theta_w} \right)^2
            \left[ \Pi_{33} - 2\sin^2\theta_w\,\Pi_{3Q} + \sin^4\theta_w\,\Pi_{QQ} \right], \\
\Pi_{WW} &= \left( \frac{e}{\sin\theta_w} \right)^2 \Pi_{11},
\end{aligned} \tag{21.132} $$

where $Q$ denotes the electric charge and 1, 2, 3 denote the components of
weak-interaction $SU(2)$.

A vacuum polarization amplitude can always be viewed as an expectation
value of a pair of currents. From this viewpoint, the quantities on the
right-hand side of (21.132) are expectation values of currents with definite quantum
numbers. For example, $\Pi_{33}$ is an expectation value of a pair of $SU(2)$ currents
$J^{\mu 3}$. Acting on the standard fermions, $J^{\mu 3}$ is a left-handed current and $J^\mu_Q$ is
a vector current.

The ultraviolet divergences in the expectation values of currents in
(21.132) have the form

$$ \begin{aligned}
\Pi_{33} &\sim (A + Bq^2)\log\Lambda^2, \\
\Pi_{11} &\sim (A + Bq^2)\log\Lambda^2, \\
\Pi_{3Q} &\sim (Bq^2)\log\Lambda^2, \\
\Pi_{QQ} &\sim (Cq^2)\log\Lambda^2.
\end{aligned} \tag{21.133} $$

We will demonstrate this explicitly later in this section. However, we can
understand this structure from the following rough argument: Since the
symmetry of the theory should be recovered at large momentum, the amplitudes
$\Pi_{33}$ and $\Pi_{11}$, which differ only by their orientation in the symmetry space,
should have the same ultraviolet divergences. The divergence in the slope
of $\Pi_{3Q}$ should be related to that in the slope of $\Pi_{33}$ because $Q = T^3 + Y$
and $\Pi_{3Y}$ is unimportant asymptotically since $\mathrm{tr}[T^3 Y] = 0$. We pointed out
in Eq. (21.116) that $\Pi_{3Q}$ and $\Pi_{QQ}$ vanish at $q^2 = 0$; thus they have no
$q^2$-independent divergences.

Now we will rewrite the two zeroth-order natural relations in (21.113) in
such a way that we can apply (21.133). To do this, we take the differences of
Eqs. (21.127), (21.128), and (21.131) to obtain

$$ \begin{aligned}
s_*^2 - \sin^2\theta_0 &= \frac{\sin^2\theta_w\cos^2\theta_w}{\cos^2\theta_w - \sin^2\theta_w}
  \bigg\{ \frac{\Pi_{ZZ}(m_Z^2)}{m_Z^2} - \frac{\Pi_{WW}(0)}{m_W^2} - \Pi'_{\gamma\gamma}(0)
  - \frac{\cos^2\theta_w - \sin^2\theta_w}{\sin\theta_w\cos\theta_w}\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2} \bigg\}, \\
s_W^2 - s_*^2 &= -\frac{\Pi_{WW}(m_W^2)}{m_Z^2} + \frac{m_W^2}{m_Z^2}\,\frac{\Pi_{ZZ}(m_Z^2)}{m_Z^2}
  + \sin\theta_w\cos\theta_w\,\frac{\Pi_{\gamma Z}(m_Z^2)}{m_Z^2}.
\end{aligned} \tag{21.134} $$

Inserting (21.132), and also using the relation $m_W = m_Z\cos\theta_w$ in the
coefficients of terms already of one-loop order, we find after some algebra

$$ \begin{aligned}
s_*^2 - \sin^2\theta_0 &= \frac{e^2}{(\cos^2\theta_w - \sin^2\theta_w)\, m_Z^2}
  \Big\{ \big[ \Pi_{33}(m_Z^2) - \Pi_{11}(0) - \Pi_{3Q}(m_Z^2) \big] \\
  &\qquad\qquad + \sin^2\theta_w\cos^2\theta_w \big[ \Pi_{QQ}(m_Z^2) - m_Z^2\,\Pi'_{QQ}(0) \big] \Big\}, \\
s_W^2 - s_*^2 &= \frac{e^2}{\sin^2\theta_w\, m_Z^2}
  \big[ \Pi_{33}(m_Z^2) - \Pi_{11}(m_W^2) - \sin^2\theta_w\,\Pi_{3Q}(m_Z^2) \big].
\end{aligned} \tag{21.135} $$

If indeed the ultraviolet divergences of the vacuum polarization integrals have
the structure of (21.133), then the divergent part of each expression in brackets
in (21.135) vanishes, and the weak interaction gives definite, finite predictions
for the differences of $s_*^2$, $s_W^2$, and $\sin^2\theta_0$.
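The algebra leading from (21.134) to (21.135), and the cancellation of divergences, can both be spot-checked numerically. The Python sketch below is only an illustration under stated assumptions: the dictionary keys, function names, and numerical inputs are hypothetical, and the current correlators are filled with random numbers rather than computed loop integrals.

```python
import math
import random

def differences_from_gauge_bosons(s2, m_z_sq, e_sq, p):
    """Right-hand sides of (21.134), with gauge-boson amplitudes built via (21.132)."""
    c2 = 1.0 - s2
    sc = math.sqrt(s2 * c2)
    m_w_sq = c2 * m_z_sq  # m_W = m_Z cos(theta_w) in one-loop coefficients
    pi_zz = e_sq / (s2 * c2) * (p["33_Z"] - 2.0 * s2 * p["3Q_Z"] + s2**2 * p["QQ_Z"])
    pi_gz = e_sq / sc * (p["3Q_Z"] - s2 * p["QQ_Z"])
    pi_ww_0 = e_sq / s2 * p["11_0"]
    pi_ww_w = e_sq / s2 * p["11_W"]
    dpi_gg = e_sq * p["dQQ_0"]
    first = s2 * c2 / (c2 - s2) * (
        pi_zz / m_z_sq - pi_ww_0 / m_w_sq - dpi_gg
        - (c2 - s2) / sc * pi_gz / m_z_sq)
    second = (-pi_ww_w / m_z_sq + (m_w_sq / m_z_sq) * pi_zz / m_z_sq
              + sc * pi_gz / m_z_sq)
    return first, second

def differences_from_currents(s2, m_z_sq, e_sq, p):
    """Right-hand sides of (21.135), written directly in terms of current correlators."""
    c2 = 1.0 - s2
    first = e_sq / ((c2 - s2) * m_z_sq) * (
        (p["33_Z"] - p["11_0"] - p["3Q_Z"])
        + s2 * c2 * (p["QQ_Z"] - m_z_sq * p["dQQ_0"]))
    second = e_sq / (s2 * m_z_sq) * (p["33_Z"] - p["11_W"] - s2 * p["3Q_Z"])
    return first, second

def divergent_parts(a, b, c, s2, m_z_sq):
    """Coefficients of log(Lambda^2) following the pattern (21.133)."""
    m_w_sq = (1.0 - s2) * m_z_sq
    return {"33_Z": a + b * m_z_sq, "11_0": a, "11_W": a + b * m_w_sq,
            "3Q_Z": b * m_z_sq, "QQ_Z": c * m_z_sq, "dQQ_0": c}

S2, M_Z_SQ, E_SQ = 0.23, 1.0, 0.09   # hypothetical inputs in arbitrary units
KEYS = ("33_Z", "11_0", "11_W", "3Q_Z", "QQ_Z", "dQQ_0")

random.seed(0)
MAX_MISMATCH = 0.0
for _ in range(100):
    sample = {key: random.uniform(-1.0, 1.0) for key in KEYS}
    lhs = differences_from_gauge_bosons(S2, M_Z_SQ, E_SQ, sample)
    rhs = differences_from_currents(S2, M_Z_SQ, E_SQ, sample)
    MAX_MISMATCH = max(MAX_MISMATCH, abs(lhs[0] - rhs[0]), abs(lhs[1] - rhs[1]))

DIVERGENT_RESIDUAL = differences_from_currents(
    S2, M_Z_SQ, E_SQ, divergent_parts(0.7, -1.3, 2.1, S2, M_Z_SQ))

if __name__ == "__main__":
    print(MAX_MISMATCH, DIVERGENT_RESIDUAL)
```

Because both relations are linear in the correlators, agreement on random inputs is a meaningful test of the coefficients, and the vanishing of `DIVERGENT_RESIDUAL` for arbitrary $A$, $B$, $C$ shows that the pattern (21.133) is exactly what is needed for finiteness.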


Computation of Vacuum Polarization Amplitudes 

We can verify the divergence structure (21.133) by computing the vacuum 
polarization diagrams for t and b quarks explicitly. Rather than computing 
these one by one, it is easiest to compute, once and for all, the most general 
fermionic vacuum polarization amplitudes, and then to recover the amplitudes 
required in the previous paragraph as special cases of these. 

Consider, then, the two vacuum polarization amplitudes shown in
Fig. 21.13. The diagrams are built from two fermion propagators with different
masses $m_1$ and $m_2$, linked by left- or right-handed currents. We call the
vacuum polarization amplitude with two left-handed currents $\Pi^{\mu\nu}_{LL}(q)$, and that
with one left and one right-handed current $\Pi^{\mu\nu}_{LR}(q)$. Since the vacuum
polarizations depend on only one momentum and two vector indices, there is no
way that they can contain an invariant involving $\epsilon^{\mu\nu\rho\sigma}$. Thus, the amplitudes
with other combinations of currents are related to these by

$$ \Pi^{\mu\nu}_{RL}(q) = \Pi^{\mu\nu}_{LR}(q), \qquad \Pi^{\mu\nu}_{RR}(q) = \Pi^{\mu\nu}_{LL}(q). \tag{21.136} $$

Figure 21.13. Elementary vacuum polarization amplitudes of fermionic currents.
In addition, there is no difficulty in regularizing these diagrams using
dimensional regularization with an anticommuting $\gamma^5$, the regularization
prescription we endorsed at the end of Section 19.4. The vacuum polarization of a
vector current is reconstructed as

$$ \Pi^{\mu\nu}_{VV}(q) = \Pi^{\mu\nu}_{LL}(q) + \Pi^{\mu\nu}_{LR}(q) + \Pi^{\mu\nu}_{RL}(q) + \Pi^{\mu\nu}_{RR}(q). \tag{21.137} $$

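The vacuum polarization of purely left-handed currents is given by

$$ \begin{aligned}
i\Pi^{\mu\nu}_{LL}(q) &= (-1)\int\!\frac{d^4k}{(2\pi)^4}\,\mathrm{tr}\bigg[
   \Big( i\gamma^\mu\,\frac{1-\gamma^5}{2} \Big)\frac{i(\not{k} + m_1)}{k^2 - m_1^2}
   \Big( i\gamma^\nu\,\frac{1-\gamma^5}{2} \Big)\frac{i(\not{k} + \not{q} + m_2)}{(k+q)^2 - m_2^2} \bigg] \\
 &= -\int\!\frac{d^4k}{(2\pi)^4}\,\mathrm{tr}\bigg[ \gamma^\mu \not{k}\,\gamma^\nu (\not{k} + \not{q})\,\frac{1+\gamma^5}{2} \bigg]
   \frac{1}{(k^2 - m_1^2)\,((k+q)^2 - m_2^2)}.
\end{aligned} \tag{21.138} $$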

The prefactor $(-1)$ comes from the fermion loop. There is no possible tensor
structure antisymmetric in $\mu$ and $\nu$, so we can now drop the $\gamma^5$ term. From
here, the calculation proceeds as in Section 7.5. We combine denominators
using

$$ \frac{1}{(k^2 - m_1^2)\,((k+q)^2 - m_2^2)} = \int_0^1 dx\,\frac{1}{(\ell^2 - \Delta)^2}, \tag{21.139} $$


where

$$ \ell = k + xq, \qquad \Delta = x\,m_2^2 + (1-x)\,m_1^2 - x(1-x)\,q^2. \tag{21.140} $$
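Both ingredients of this step can be spot-checked numerically: the Feynman-parameter identity (21.139), and the completion of the square that defines $\ell$ and $\Delta$ in (21.140). The Python sketch below uses a hand-written Simpson rule and hypothetical sample momenta and masses chosen only so that neither denominator vanishes; none of these numbers come from the text.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def minkowski_dot(p, q):
    """Four-vector product with metric signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

# Hypothetical sample kinematics (arbitrary units).
K = (0.3, 0.1, -0.2, 0.4)
Q = (1.5, 0.2, 0.1, -0.3)
M1_SQ, M2_SQ = 2.0, 5.0

K_PLUS_Q = tuple(k + q for k, q in zip(K, Q))
A = minkowski_dot(K_PLUS_Q, K_PLUS_Q) - M2_SQ   # (k+q)^2 - m_2^2
B = minkowski_dot(K, K) - M1_SQ                 # k^2 - m_1^2

def completed_square_gap(x):
    """x*A + (1-x)*B minus (ell^2 - Delta), which should vanish identically."""
    ell = tuple(k + x * q for k, q in zip(K, Q))
    delta = x * M2_SQ + (1.0 - x) * M1_SQ - x * (1.0 - x) * minkowski_dot(Q, Q)
    return (x * A + (1.0 - x) * B) - (minkowski_dot(ell, ell) - delta)

LHS = 1.0 / (A * B)
RHS = simpson(lambda x: 1.0 / (x * A + (1.0 - x) * B) ** 2, 0.0, 1.0)
MAX_GAP = max(abs(completed_square_gap(i / 10.0)) for i in range(11))

if __name__ == "__main__":
    print(LHS, RHS, MAX_GAP)
```

The second check makes explicit that, with $\ell = k + xq$, the parameter $x$ multiplies $m_2^2$ and $(1-x)$ multiplies $m_1^2$ in $\Delta$.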


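Then, integrating with dimensional regularization and following the steps
leading to Eq. (7.90), we find

$$ i\Pi^{\mu\nu}_{LL}(q) = -\frac{4i}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}
   \Big[ g^{\mu\nu}\Big( x(1-x)\,q^2 - \tfrac12\big( x\,m_2^2 + (1-x)\,m_1^2 \big) \Big) - x(1-x)\,q^\mu q^\nu \Big]. \tag{21.141} $$

Notice that both $\Pi_{LL}$ and its first derivative with respect to $q^2$ are
logarithmically divergent.

The vacuum polarization amplitude $\Pi_{LR}$ can be obtained in a very similar
fashion. From the Feynman rules,

$$ \begin{aligned}
i\Pi^{\mu\nu}_{LR}(q) &= (-1)\int\!\frac{d^4k}{(2\pi)^4}\,\mathrm{tr}\bigg[
   \Big( i\gamma^\mu\,\frac{1-\gamma^5}{2} \Big)\frac{i(\not{k} + m_1)}{k^2 - m_1^2}
   \Big( i\gamma^\nu\,\frac{1+\gamma^5}{2} \Big)\frac{i(\not{k} + \not{q} + m_2)}{(k+q)^2 - m_2^2} \bigg] \\
 &= -\int\!\frac{d^4k}{(2\pi)^4}\,\mathrm{tr}\bigg[ \gamma^\mu m_1 \gamma^\nu m_2\,\frac{1+\gamma^5}{2} \bigg]
   \frac{1}{(k^2 - m_1^2)\,((k+q)^2 - m_2^2)}.
\end{aligned} \tag{21.142} $$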

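From here, the same manipulations as in the previous paragraph lead to

$$ i\Pi^{\mu\nu}_{LR}(q) = -\frac{2i}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}
   \big[ g^{\mu\nu}\, m_1 m_2 \big]. \tag{21.143} $$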


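As a check, we can use (21.141), (21.143), and (21.136), setting $m_1 = m_2 = m$,
to assemble the QED vacuum polarization of vector currents. We find

$$ \begin{aligned}
i\Pi^{\mu\nu}(q) &= ie^2\big[ \Pi^{\mu\nu}_{LL} + \Pi^{\mu\nu}_{LR} + \Pi^{\mu\nu}_{RL} + \Pi^{\mu\nu}_{RR} \big] \\
 &= -\frac{8ie^2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}
    \big[ x(1-x)\,q^2 g^{\mu\nu} - x(1-x)\,q^\mu q^\nu \big],
\end{aligned} \tag{21.144} $$

where now $\Delta = m^2 - x(1-x)q^2$. This coincides precisely with our result from
Section 7.5.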

As we argued below (21.115), only the terms in the vacuum polarization
amplitudes proportional to $g^{\mu\nu}$ will enter our expressions for weak-interaction
radiative corrections. Thus, we can summarize the calculation of the basic
vacuum polarization amplitudes by quoting the results for this leading form
factor:

$$ \begin{aligned}
\Pi_{LL}(q^2) = \Pi_{RR}(q^2) &= -\frac{4}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}
   \Big[ x(1-x)\,q^2 - \tfrac12\big( x\,m_2^2 + (1-x)\,m_1^2 \big) \Big]; \\
\Pi_{LR}(q^2) = \Pi_{RL}(q^2) &= -\frac{2}{(4\pi)^{d/2}}\int_0^1 dx\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}}
   \big[ m_1 m_2 \big].
\end{aligned} \tag{21.145} $$

From these terms, we can assemble any desired vacuum polarization of $t$ and $b$
quarks in the weak-interaction gauge theory. To make use of these expressions
more easily, we will expand the quantities (21.145) in the limit $d \to 4$. If we
set $\epsilon = 4 - d$, the integrands of the expressions above simplify according to

$$ \frac{1}{(4\pi)^{d/2}}\,\frac{\Gamma(2-\frac{d}{2})}{\Delta^{2-d/2}} \;\to\;
   \frac{1}{(4\pi)^2}\Big[ \frac{2}{\epsilon} - \gamma + \log(4\pi) - \log\Delta \Big]. \tag{21.146} $$

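Let

$$ E = \frac{2}{\epsilon} - \gamma + \log(4\pi) - \log(M^2), \tag{21.147} $$

where $M$ is an arbitrary subtraction scale. It is useful to define

$$ \begin{aligned}
b_0(12X) = b_0(m_1^2, m_2^2, q_X^2) &= \int_0^1 dx\,\log\big( \Delta(m_1^2, m_2^2, q_X^2)/M^2 \big), \\
b_1(12X) = b_1(m_1^2, m_2^2, q_X^2) &= \int_0^1 dx\;x\,\log\big( \Delta(m_1^2, m_2^2, q_X^2)/M^2 \big), \\
b_2(12X) = b_2(m_1^2, m_2^2, q_X^2) &= \int_0^1 dx\;x(1-x)\,\log\big( \Delta(m_1^2, m_2^2, q_X^2)/M^2 \big).
\end{aligned} \tag{21.148} $$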

The abbreviated notation will prove useful below. In (21.148), $X$ labels a
momentum scale; we will need $q_X^2 = 0, m_W^2, m_Z^2$. Note that for equal masses,

$$ b_1(11X) = \tfrac12\, b_0(11X). \tag{21.149} $$
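The functions of (21.148) are simple one-dimensional integrals and can be evaluated by direct quadrature. The Python sketch below does this below threshold, where $\Delta > 0$ and the logarithm is real, and checks the equal-mass identity (21.149). It also checks the relation $b_1(12X) + b_1(21X) = b_0(12X)$, which is not stated in the text but follows from exchanging $x \leftrightarrow 1-x$. The quark masses, the choice $M^2 = m_Z^2$, and all function names are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def delta(x, m1_sq, m2_sq, q_sq):
    """Delta of Eq. (21.140): x multiplies m_2^2, (1 - x) multiplies m_1^2."""
    return x * m2_sq + (1.0 - x) * m1_sq - x * (1.0 - x) * q_sq

WEIGHTS = {0: lambda x: 1.0, 1: lambda x: x, 2: lambda x: x * (1.0 - x)}

def b_function(order, m1_sq, m2_sq, q_sq, scale_sq):
    """b_0, b_1 or b_2 of Eq. (21.148), valid only where Delta stays positive."""
    weight = WEIGHTS[order]

    def integrand(x):
        d = delta(x, m1_sq, m2_sq, q_sq)
        if d <= 0.0:
            raise ValueError("Delta <= 0: above threshold, the logarithm is complex")
        return weight(x) * math.log(d / scale_sq)

    return simpson(integrand, 0.0, 1.0)

# Illustrative (assumed) masses in GeV; the subtraction scale M is arbitrary.
M_T_SQ, M_B_SQ, M_Z_SQ = 175.0**2, 4.5**2, 91.187**2
SCALE_SQ = M_Z_SQ

B0_TTZ = b_function(0, M_T_SQ, M_T_SQ, M_Z_SQ, SCALE_SQ)
B1_TTZ = b_function(1, M_T_SQ, M_T_SQ, M_Z_SQ, SCALE_SQ)
B0_TB0 = b_function(0, M_T_SQ, M_B_SQ, 0.0, SCALE_SQ)
B1_TB0 = b_function(1, M_T_SQ, M_B_SQ, 0.0, SCALE_SQ)
B1_BT0 = b_function(1, M_B_SQ, M_T_SQ, 0.0, SCALE_SQ)

if __name__ == "__main__":
    print(B0_TTZ, B1_TTZ)
    print(B0_TB0, B1_TB0, B1_BT0)
```

For unequal masses $b_1(12X)$ and $b_1(21X)$ differ, so the order of the mass labels matters whenever a $t$-$b$ loop is evaluated.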

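With this notation,

$$ \Pi_{LL}(q_X^2) = -\frac{4}{(4\pi)^2}\Big[ \big( \tfrac16 q_X^2 - \tfrac14 (m_1^2 + m_2^2) \big) E
   - q_X^2\, b_2(12X) + \tfrac12\big( m_2^2\, b_1(12X) + m_1^2\, b_1(21X) \big) \Big] \tag{21.150} $$

and

$$ \Pi_{LR}(q_X^2) = -\frac{2}{(4\pi)^2}\big[ m_1 m_2\, E - m_1 m_2\, b_0(12X) \big]. \tag{21.151} $$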

We can now reconstruct all of the specific vacuum polarization amplitudes
that appear in Eq. (21.135) in terms of divergences proportional to $E$ and
finite parts proportional to the $b_i$. The simplest is the expectation value of
electromagnetic currents, which is given in our present notation by

$$ \Pi_{QQ}(q_X^2) = -\frac{8}{(4\pi)^2}\cdot 3\cdot\Big[ \big( \tfrac23 \big)^2 \big( \tfrac16 q_X^2 E - q_X^2\, b_2(ttX) \big)
   + \big( \tfrac13 \big)^2 \big( \tfrac16 q_X^2 E - q_X^2\, b_2(bbX) \big) \Big]. \tag{21.152} $$

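The prefactor 3 is the trace over colors. As we expect from QED, (21.152)
contains a divergence only in a term proportional to $q_X^2$. The divergent parts
of the other amplitudes are

$$ \begin{aligned}
\Pi_{33}(q_X^2) &= -\frac{12}{(4\pi)^2}\Big[ \tfrac{1}{12}\, q_X^2 - \tfrac18 (m_t^2 + m_b^2) \Big] E + \cdots, \\
\Pi_{11}(q_X^2) &= -\frac{12}{(4\pi)^2}\Big[ \tfrac{1}{12}\, q_X^2 - \tfrac18 (m_t^2 + m_b^2) \Big] E + \cdots, \\
\Pi_{3Q}(q_X^2) &= -\frac{12}{(4\pi)^2}\Big[ \tfrac{1}{12}\, q_X^2 \Big] E + \cdots.
\end{aligned} \tag{21.153} $$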


These divergences indeed follow the pattern claimed in Eq. (21.133), and thus 
the predictions of the weak interaction gauge theory given in (21.135) are free 
of ultraviolet divergences. 
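The pattern of divergences can be reproduced with exact rational arithmetic from the coefficients of $E$ in (21.150) and (21.151). The Python sketch below is an illustration under stated assumptions: the decomposition of each current correlator into $LL$ and $LR$ pieces uses the standard quantum numbers $T^3 = \pm\tfrac12$, $Q = \tfrac23, -\tfrac13$ and a color factor 3, which are supplied here rather than quoted from the text, and all helper names are hypothetical. Coefficients are expressed in units of $-1/(4\pi)^2$.

```python
from fractions import Fraction as Fr

KEYS = ("q2", "mt2", "mb2")

def combo(*terms):
    """Linear combination sum(weight * coeffs) of coefficient dictionaries."""
    out = {key: Fr(0) for key in KEYS}
    for weight, coeffs in terms:
        for key in KEYS:
            out[key] += weight * coeffs[key]
    return out

def ll(m1, m2):
    """E-coefficient of Pi_LL from (21.150): 4*(q^2/6 - (m1^2 + m2^2)/4)."""
    out = {"q2": Fr(4, 6), "mt2": Fr(0), "mb2": Fr(0)}
    out[m1] -= Fr(1)
    out[m2] -= Fr(1)
    return out

def lr(m):
    """E-coefficient of Pi_LR from (21.151) for equal masses: 2*m^2."""
    out = {key: Fr(0) for key in KEYS}
    out[m] += Fr(2)
    return out

def lv(m):
    """Left-handed current with a vector current: Pi_LL + Pi_LR."""
    return combo((1, ll(m, m)), (1, lr(m)))

COLORS = Fr(3)
T3_T, T3_B = Fr(1, 2), Fr(-1, 2)     # weak isospin of t_L and b_L (assumed)
Q_T, Q_B = Fr(2, 3), Fr(-1, 3)       # electric charges (assumed)

PI_33 = combo((COLORS * T3_T**2, ll("mt2", "mt2")),
              (COLORS * T3_B**2, ll("mb2", "mb2")))
# T^1 = (T^+ + T^-)/2 gives two t-b loops, each with weight 1/4.
PI_11 = combo((COLORS * Fr(1, 2), ll("mt2", "mb2")))
PI_3Q = combo((COLORS * T3_T * Q_T, lv("mt2")),
              (COLORS * T3_B * Q_B, lv("mb2")))
# A vector-vector correlator is LL + LR + RL + RR = 2 * (LL + LR).
PI_QQ = combo((COLORS * Q_T**2 * 2, lv("mt2")),
              (COLORS * Q_B**2 * 2, lv("mb2")))

if __name__ == "__main__":
    for name, value in (("33", PI_33), ("11", PI_11), ("3Q", PI_3Q), ("QQ", PI_QQ)):
        print(name, {key: str(val) for key, val in value.items()})
```

In these units $\Pi_{33}$ and $\Pi_{11}$ share the coefficients $q^2 - \tfrac32(m_t^2 + m_b^2)$, $\Pi_{3Q}$ has the same $q^2$ coefficient with no mass term, and $\Pi_{QQ}$ is purely proportional to $q^2$, which is the structure assumed in (21.133).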


The Effect of $m_t$

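Using the notation we have developed, we can write the finite parts of the
relations (21.135) in a compact form. The first relation becomes

$$ \begin{aligned}
s_*^2 - \sin^2\theta_0 = \frac{3\alpha}{\pi(\cos^2\theta_w - \sin^2\theta_w)} \bigg\{ &
   \big( \tfrac14 - \tfrac13 \big)\, b_2(ttZ) + \big( \tfrac14 - \tfrac16 \big)\, b_2(bbZ) \\
 & - \Big( \frac{m_t^2}{4m_Z^2}\big[ b_1(ttZ) - b_1(bt0) \big] + \frac{m_b^2}{4m_Z^2}\big[ b_1(bbZ) - b_1(tb0) \big] \Big) \\
 & + 2\sin^2\theta_w\cos^2\theta_w \Big( \tfrac49 \big[ b_2(ttZ) - b_2(tt0) - m_Z^2\, b_2'(tt0) \big] \\
 & \qquad\qquad + \tfrac19 \big[ b_2(bbZ) - b_2(bb0) - m_Z^2\, b_2'(bb0) \big] \Big) \bigg\}.
\end{aligned} \tag{21.154} $$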

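Similarly, the second relation becomes

$$ \begin{aligned}
s_W^2 - s_*^2 = \frac{3\alpha}{\pi\sin^2\theta_w} \bigg\{ &
   \big( \tfrac14 - \tfrac13 \sin^2\theta_w \big)\, b_2(ttZ) + \big( \tfrac14 - \tfrac16 \sin^2\theta_w \big)\, b_2(bbZ)
   - \tfrac12 \cos^2\theta_w\, b_2(tbW) \\
 & - \Big( \frac{m_t^2}{4m_Z^2}\big[ b_1(ttZ) - b_1(btW) \big] + \frac{m_b^2}{4m_Z^2}\big[ b_1(bbZ) - b_1(tbW) \big] \Big) \bigg\}.
\end{aligned} \tag{21.155} $$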

Though it is now straightforward to work out the complete expressions
for the relations (21.154) and (21.155), we will content ourselves here with
identifying the most important term in the limit in which the $t$ quark mass
becomes large. Notice that, in each of these expressions, there are terms with
coefficients proportional to $m_t^2/m_Z^2$. These are easiest to understand within
the simpler combination of vacuum polarization amplitudes

$$ \begin{aligned}
\Pi_{11}(0) - \Pi_{33}(0) &= \frac{3}{16\pi^2}\Big[ m_t^2 \big( b_1(tt0) - b_1(bt0) \big) + m_b^2 \big( b_1(bb0) - b_1(tb0) \big) \Big] \\
 &= \frac{3}{16\pi^2}\int_0^1 dx\,\Big\{ x\,m_t^2 \log\frac{m_t^2}{M^2} + (1-x)\,m_b^2 \log\frac{m_b^2}{M^2} \\
 &\qquad\qquad - \big( x\,m_t^2 + (1-x)\,m_b^2 \big) \log\frac{x\,m_t^2 + (1-x)\,m_b^2}{M^2} \Big\} \\
 &= \frac{3}{16\pi^2}\int_0^1 dx\,\Big\{ x\,m_t^2 \log\frac{m_t^2}{x\,m_t^2 + (1-x)\,m_b^2} + \mathcal{O}(m_b^2) \Big\} \\
 &= \frac{3}{16\pi^2}\,\frac14\, m_t^2 + \mathcal{O}(m_b^2)
\end{aligned} \tag{21.156} $$


for $m_t \gg m_b$. If $m_t$ is also much greater than $m_Z$, one can find a
contribution proportional to $m_t^2/m_Z^2$ in each of the relations (21.154), (21.155) by
replacing the argument $q_X^2 = m_Z^2$ with $q_X^2 = 0$, using (21.156), and ignoring
all other contributions. One can show, by detailed examination of (21.154)
and (21.155), that this procedure gives the complete leading term in $m_t$. The
result is

$$ \begin{aligned}
s_*^2 - \sin^2\theta_0 &= -\frac{3\alpha}{16\pi(\cos^2\theta_w - \sin^2\theta_w)}\,\frac{m_t^2}{m_Z^2} + \cdots, \\
s_W^2 - s_*^2 &= -\frac{3\alpha}{16\pi\sin^2\theta_w}\,\frac{m_t^2}{m_Z^2} + \cdots,
\end{aligned} \tag{21.157} $$

where the omitted terms are of order $\alpha$ with no enhancement.
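To get a feeling for these results, the Python sketch below evaluates the Feynman-parameter integral of (21.156) by quadrature, compares it with the large-$m_t$ value $m_t^2/4$, and then evaluates the leading shifts (21.157). The closed form used as a cross-check is obtained by doing the $x$ integral analytically and is not quoted in the text; the values of $m_t$, $m_b$, $m_Z$, $\alpha$, and $\sin^2\theta_w$ are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def isospin_breaking_integral(mt_sq, mb_sq):
    """Integral of (21.156) without the 3/(16 pi^2) prefactor; M^2 drops out."""
    def integrand(x):
        mix = x * mt_sq + (1.0 - x) * mb_sq
        return (x * mt_sq * math.log(mt_sq)
                + (1.0 - x) * mb_sq * math.log(mb_sq)
                - mix * math.log(mix))
    return simpson(integrand, 0.0, 1.0)

def closed_form(mt_sq, mb_sq):
    """Analytic value of the same integral (derived here as a cross-check)."""
    log_term = 2.0 * mt_sq * mb_sq / (mt_sq - mb_sq) * math.log(mt_sq / mb_sq)
    return 0.25 * (mt_sq + mb_sq - log_term)

def leading_top_shifts(alpha, s2, m_t, m_z):
    """Leading m_t^2 terms of Eq. (21.157)."""
    c2 = 1.0 - s2
    common = -3.0 * alpha / (16.0 * math.pi) * m_t**2 / m_z**2
    return common / (c2 - s2), common / s2

# Assumed illustrative inputs (masses in GeV).
M_T, M_B, M_Z = 180.0, 4.5, 91.187
ALPHA, S2 = 1.0 / 128.9, 0.2307

NUMERIC = isospin_breaking_integral(M_T**2, M_B**2)
ANALYTIC = closed_form(M_T**2, M_B**2)
RATIO_TO_LEADING = NUMERIC / (M_T**2 / 4.0)
SHIFT_STAR_0, SHIFT_W_STAR = leading_top_shifts(ALPHA, S2, M_T, M_Z)

if __name__ == "__main__":
    print(NUMERIC, ANALYTIC, RATIO_TO_LEADING)
    print(SHIFT_STAR_0, SHIFT_W_STAR)
```

With these inputs the $\mathcal{O}(m_b^2)$ terms change the integral by only about a percent, and the leading shifts are several units of $10^{-3}$, well above the uncertainty quoted for the reference value in (21.112).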

The enhancement factor $m_t^2/m_Z^2$ is exactly the one that we found in our
study of top quark decay in Section 21.2. It reflects the fact that some
electroweak couplings of the top quark are effectively proportional to $\lambda_t$, the top
quark coupling to the Higgs sector, instead of simply to the weak-interaction
coupling $g$.

The complete numerical evaluation of the formulae for $s_*^2$ and $s_W^2$ is shown
in Fig. 21.14. To compare the results of this section to experiment, we have
included, in addition to the top quark effect, the $m_t$-independent one-loop
corrections from loops containing $W$ and $Z$ bosons and light quarks and leptons.
In the figure, the predictions are compared to the value of $s_*^2$ obtained from
the measurement of the $Z^0$ polarization and forward-backward asymmetries
and the value of $s_W^2$ obtained from measurement of the $W$ boson mass.

Figure 21.14. Dependence of $s_*^2$ and $s_W^2$ on the top quark mass, for fixed $\alpha$,
$G_F$, $m_Z$. The three curves in each group correspond to three different values
of the Higgs boson mass: 100, 300, 1000 GeV from bottom to top. The curves
are compared to values of $s_*^2$ and $s_W^2$, taken from the article of Langacker
and Erler quoted in Table 20.1, and the CDF/D0 value of the top quark mass
given in Eq. (21.159).

According to the figure, the weak-interaction gauge theory requires the top
quark mass radiative correction (or a similar radiative correction from some
other heavy particle) for its consistency with experiment. The top quark is
predicted to have a mass approximately equal to 170 GeV. A recent analysis
of all neutral current weak-interaction data has given the prediction†

$$ m_t = 169 \pm 24\ \text{GeV}. \tag{21.158} $$

Just as this book was being completed, the CDF and D0 experiments at
Fermilab announced the observation of the production of top quark pairs in
proton-antiproton scattering. From kinematic fits to events believed to contain
top quarks, these experiments reported*

$$ m_t = 180 \pm 13\ \text{GeV}. \tag{21.159} $$

The discovery of the top quark in just the range required by precision
electroweak measurements is quite remarkable. We can only conclude that, in the
domain of weak interactions as well as those of electromagnetic, strong, and
scalar interactions that we have studied earlier, the fluctuations predicted by
quantum field theory make their imprint on the phenomena of Nature.


†P. Langacker and J. Erler, in Review of Particle Properties, Phys. Rev. D50,
1304 (1994).

*F. Abe, et al., Phys. Rev. Lett. 74, 2626 (1995); S. Abachi, et al., Phys. Rev.
Lett. 74, 2632 (1995).

Problems


21.1 Weak-interaction contributions to the muon $g - 2$. The GWS model of
the weak interactions leads to two new contributions to the anomalous magnetic
moments of the leptons. Because these contributions are proportional to $G_F m_\ell^2$, they are
extremely small for the electron, but for the muon they might possibly be observable.
Both contributions are larger than the contribution of the Higgs boson discussed in
Problem 6.3.

(a) Consider first the contribution to the muon electromagnetic vertex function that
involves a $W$-neutrino loop diagram. In the $R_\xi$ gauges, this diagram is
accompanied by diagrams in which $W$ propagators are replaced by propagators for
Goldstone bosons. Compute the sum of these diagrams in the Feynman-'t Hooft
gauge and show that, in the limit $m_W \gg m_\mu$, they contribute the following
term to the anomalous magnetic moment of the muon:

$$ a_\mu(\nu) = \frac{G_F m_\mu^2}{8\pi^2\sqrt{2}}\cdot\frac{10}{3}. $$

(b) Repeat the calculation of part (a) in a general $R_\xi$ gauge. Show explicitly that
the result of part (a) is independent of $\xi$.

(c) A second new contribution is that from a $Z$-muon loop diagram and the
corresponding diagram with the $Z$ replaced by a Goldstone boson. Show that these
diagrams contribute

$$ a_\mu(Z) = -\frac{G_F m_\mu^2}{8\pi^2\sqrt{2}}\cdot\Big( \frac43 + \frac83 \sin^2\theta_w - \frac{16}{3}\sin^4\theta_w \Big). $$


21.2 Complete analysis of $e^+e^- \to W^+W^-$.

(a) Using explicit polarization vectors, work out the amplitudes for
$e^+e^- \to W^+W^-$ from left- and right-handed electrons to states in which the $W^+$ and
$W^-$ have definite helicity. For the cases in which both $W$ bosons have
longitudinal polarization, verify that Eq. (21.99) gives the correct high-energy limit
for right-handed electrons, and verify the complete expression (21.108) for
left-handed electrons. For the cases in which one $W$ is longitudinally polarized and
the second is transversely polarized, show that the individual diagrams give
contributions to the amplitudes that grow like $\sqrt{s}$, but that the complete amplitudes
fall as $1/\sqrt{s}$.

(b) Show that the contributions to $e^-_L e^+_R \to W^-W^+$ found in part (a) reproduce
Fig. 21.10, and that the differential cross section for $e^-_R e^+_L \to W^-W^+$ is about
30 times smaller. How many of the qualitative features of the figure can you
understand physically?

21.3 Cross section for $d\bar{u} \to W^-\gamma$. Compute the amplitudes for $d\bar{u} \to W^-\gamma$
for the various possible initial and final helicities. Ignore the quark masses. In this
approximation, only the annihilation amplitude from $d_L\bar{u}_R$ is nonzero. Show that the
scattering amplitudes for all final helicity combinations vanish at $\cos\theta = -1/3$, where
$\theta$ is the scattering angle in the center-of-mass system. Compute the differential cross
section as a function of $\cos\theta$.

21.4 Dependence of radiative corrections on the Higgs boson mass.

(a) Consider the contributions to weak-interaction radiative corrections involving 
the physical Higgs boson h° of the GWS model. The couplings of the h° were 
discussed near the end of Section 20.2. Show that, if we ignore terms proportional 
to the masses of light fermions, the Higgs boson contributes one-loop corrections 
to the processes considered in Section 21.3 only through vacuum polarization 
diagrams. It follows that the contributions to vacuum polarization amplitudes 
that depend on the Higgs boson mass are gauge invariant. 

(b) Draw the vacuum polarization diagrams in Feynman-'t Hooft gauge that involve
the Higgs boson, and compute the dependence of the various vacuum polarization
amplitudes on the Higgs boson mass $m_h$.

(c) Show that, for $m_h \gg m_W$, the natural relations discussed in Section 21.3 receive
corrections

$$ s_*^2 - s_0^2 = \frac{\alpha\,(1 + 9\sin^2\theta_w)}{48\pi\,(\cos^2\theta_w - \sin^2\theta_w)}\,\log\frac{m_h^2}{m_W^2}, \qquad
   s_W^2 - s_*^2 = \frac{5\alpha}{24\pi}\,\log\frac{m_h^2}{m_W^2}. $$

The effect of varying $m_h$ is displayed in Fig. 21.14 and is included as a theoretical
uncertainty in the prediction (21.158). More accurate experiments might allow
one to predict $m_h$ from its effect on electroweak radiative corrections.

Decays of the Higgs Boson


At the end of Section 20.2, we discussed the mystery of the origin of spontaneous
symmetry breaking in the weak interactions. The simplest hypothesis
is that the $SU(2) \times U(1)$ gauge symmetry of the weak interactions is broken
by the expectation value of a two-component scalar field $\phi$. However, since
we have almost no experimental information about the mechanism of this 
symmetry breaking, many other possibilities can be suggested. 

Eventually, this problem should be resolved by experimental observation
of the particles associated with the symmetry breaking. To form incisive
experimental tests, we should compute the properties expected for these particles.
We saw in Section 20.2 that, if the symmetry is indeed broken by a
single scalar field $\phi$, the symmetry-breaking sector contributes only one new
particle, a scalar $h^0$ called the Higgs boson. The mass $m_h$ of this particle is
unknown. However, the couplings of the $h^0$ to known fermions and bosons are
completely determined by the masses of those particles and the weak interaction
coupling constants. Thus, it is possible to compute the amplitudes for
production and decay of the $h^0$ in some detail. More complicated models of
$SU(2) \times U(1)$ symmetry breaking typically contain one or more particles that
share some properties with the $h^0$. Thus, this study is a useful starting point
for the more general analysis of experimental tests of these models.

In this Final Project you will compute the amplitudes for the Higgs boson
$h^0$ to decay to pairs of quarks, leptons, and gauge bosons. The computations
beautifully illustrate the working of perturbation theory for non-Abelian
gauge fields. Those decays of the Higgs boson that involve quarks and gluons
bring in aspects of QCD. Thus, this exercise reviews all of the important technical
methods of Part III. Except for a question raised at the end of part (a),
the problem relies only on material from unstarred sections of Part III. The
material in Chapter 20 plays an essential role. Material from Chapter 21 enters
the analysis only in parts (b) and (f), and the other parts of the problem
(except for the final summary in part (h)) do not rely on these. If you have 
studied Section 19.5, you will have some additional insight into the results of 
parts (c) and (f), but this insight is not necessary to work the problem. 

Consider, then, the minimal form of the Glashow-Weinberg-Salam electroweak
theory with one Higgs scalar field $\phi$. The physical Higgs boson $h^0$ of
this theory was discussed in Section 20.2, and we listed there the couplings of
this particle to quarks, leptons, and gauge bosons. You can now use that information
to compute the amplitudes for the various possible decays of the $h^0$
as a function of its mass $m_h$. You will discover that the decay pattern has a
complicated dependence on the mass of the Higgs boson, with different decay
modes dominating in different mass ranges. The dependences of the various
decay rates on $m_h$ illustrate many aspects of the physics of gauge theories
that we have discussed in Part III. 

In working this exercise, you should consider $m_h$ as a free parameter. For
the other parameters of weak-interaction theory, you might use the following
values: $m_b = 5$ GeV, $m_t = 175$ GeV, $m_W = 80$ GeV, $m_Z = 91$ GeV, $\sin^2\theta_w = 0.23$,
$\alpha_s(m_Z) = 0.12$.

(a) Compute first the rate for $h^0 \to f\bar{f}$, where $f$ is a quark or lepton of the
standard model. After a completely trivial computation, you should find

$$ \Gamma(h^0 \to f\bar{f}) = \left(\frac{\alpha m_h}{8\sin^2\theta_w}\right) \cdot \frac{m_f^2}{m_W^2} \left(1 - \frac{4m_f^2}{m_h^2}\right)^{3/2} \cdot N_c(f), $$

where $N_c(f) = 1$ for leptons, 3 for quarks. If you have studied Chapter
18, you might improve this result for the case in which the fermion $f$
is a quark, by computing the leading-log QCD corrections for the case
$m_h \gg m_q$. Remember that the quark mass $m_q$ is determined at the quark
threshold $M^2 \sim m_q^2$.

(b) If $m_h > 2m_W$, the Higgs boson can decay to $W^+W^-$; if it is just a bit
heavier, it can also decay to $Z^0Z^0$. Compute the decay widths to these
final states. You can check your result in the following way: If $m_h \gg m_W$,
the dominant contribution to the decay comes from production of
longitudinally polarized $W$ or $Z$ bosons, and this contribution can be
estimated as follows:

$$ \Gamma(h^0 \to W^+W^-) \approx \Gamma(h^0 \to \phi^+\phi^-), \qquad \Gamma(h^0 \to Z^0Z^0) \approx \Gamma(h^0 \to \phi^3\phi^3), $$

where $\phi^\pm$, $\phi^3$ are the Goldstone bosons of the Higgs sector and the quantities
on the right-hand sides of these relations are computed in the ungauged
Higgs theory. Explain why this statement should be true, and
verify it explicitly.

(c) The third important decay mode of the Higgs is the decay to 2 gluons.
The amplitude for this decay is generated by diagrams involving quark
loops. Compute these diagrams, using dimensional regularization. The diagrams
will be finite, but nevertheless there is a subtle contribution which
apparently depends on the regulator. Check that you have computed the
amplitude correctly by verifying that it is gauge invariant. Then square
the amplitude and construct the decay rate. You should find

$$ \Gamma(h^0 \to gg) = \left(\frac{\alpha m_h}{8\sin^2\theta_w}\right) \cdot \frac{m_h^2}{m_W^2} \cdot \frac{\alpha_s^2}{9\pi^2} \cdot \Big|\sum_q I(m_h^2/m_q^2)\Big|^2, $$

where the sum runs over all quark species and $I(m_h^2/m_q^2)$ is a form factor
with the property that $I(x) \to 1$ as $x \to 0$ and $I(x) \to 0$ as $x \to \infty$.
This property implies that the dominant contribution to the decay rate 
comes from very heavy quarks. You need not evaluate I(x) explicitly at 
this stage; just leave it in the form of a Feynman parameter integral. 

(d) The existence of the process $h^0 \to 2g$ implies the existence of the inverse
process $g + g \to h^0$, which is a mechanism for production of Higgs bosons
in proton-proton collisions. Using the parton model, derive a relation
between the partial width $\Gamma(h^0 \to 2g)$ and the total cross section for
$pp \to h^0 + X$. Compute this cross section numerically (in nanobarns) for
a 30 GeV Higgs for $pp$ collisions of center-of-mass energy 1-40 TeV. (1
TeV = $10^3$ GeV.) For the purposes of this problem set (though this is
not actually a good approximation) you may ignore the $Q^2$ dependence
of the gluon distribution function and take simply

$$ f_g(x) = 8\,\frac{1}{x}\,(1-x)^7. $$

You may also set $I(m_h^2/m_t^2) = 1$; this is correct to about 10%.

(e) The final decay mode that you should consider is $h^0 \to 2\gamma$. Consider first
the contribution from the loop diagrams involving quarks and leptons.
Show that the result is simply expressed in terms of the form factor
$I(m_h^2/m_q^2)$ that you derived in part (c).

(f) Next, compute the contribution to this decay amplitude from the loop
diagram involving $W$ bosons, and the various diagrams one must add to
this to obtain a gauge-invariant result. It is easiest to work in Feynman-'t Hooft
gauge. Add this contribution to that of very heavy quarks and
leptons, each with electric charge $Q_f$. Your result should reduce to the
following expression in the limit $m_h \ll m_W$:

$$ \Gamma(h^0 \to 2\gamma) = \left(\frac{\alpha m_h}{8\sin^2\theta_w}\right) \cdot \frac{m_h^2}{m_W^2} \cdot \frac{\alpha^2}{18\pi^2} \cdot \Big|\sum_f Q_f^2 N_c(f) - \frac{21}{4}\Big|^2. $$

(g) Now work out the detailed behavior of the form factor $I(x)$ defined in part
(c). Reduce your expression from part (c) to a one-parameter integral,
then evaluate this integral numerically. Plot $I(m_h^2/m_t^2)$ over the range 50
GeV < $m_h$ < 500 GeV, and compute the decay width $\Gamma(h^0 \to 2g)$ numerically
(in keV) over this range. The computation of part (f) introduces
an additional form factor; compute this function in the same way.

(h) Finally, put together all the pieces. Find the branching fraction of the
$h^0$ into each of its five major decay modes $b\bar{b}$, $t\bar{t}$, $gg$, $W^+W^-$, $Z^0Z^0$, for
Higgs bosons of mass 50 GeV - 500 GeV.
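
As a starting point for the numerical parts of the project, the following Python sketch evaluates three of the ingredients. It is an illustration under stated assumptions, not a worked solution. It assumes the tree-level width $\Gamma(h^0 \to f\bar{f}) = (\alpha m_h / 8\sin^2\theta_w)(m_f^2/m_W^2)(1 - 4m_f^2/m_h^2)^{3/2} N_c(f)$ of part (a); the Feynman-parameter form $I(x) = 3\int_0^1 dy \int_0^{1-y} dz\,(1 - 4yz)/(1 - xyz)$ together with its standard closed form; and the narrow-width parton-model relation $\sigma = (\pi^2/8)\,(\Gamma(h^0 \to gg)/m_h^3)\,\tau \int_\tau^1 (dx/x)\, f_g(x) f_g(\tau/x)$ with $\tau = m_h^2/s$. The value $\alpha = 1/128$, the unit conversion constant, and all function names are choices made for this sketch and do not come from the text.

```python
import math

# Inputs suggested in the text; ALPHA = 1/128 is an added assumption.
ALPHA, SIN2_W, ALPHA_S = 1.0 / 128.0, 0.23, 0.12
M_W, M_B, M_T = 80.0, 5.0, 175.0        # GeV
GEV2_TO_NB = 3.894e5                    # 1 GeV^-2 expressed in nanobarns
PREFACTOR = ALPHA / (8.0 * SIN2_W)      # common factor alpha / (8 sin^2 theta_w)


def width_ffbar(m_h, m_f, n_c):
    """Part (a): tree-level width (GeV) for h0 -> f fbar; zero below threshold."""
    if m_h <= 2.0 * m_f:
        return 0.0
    velocity_sq = 1.0 - 4.0 * m_f**2 / m_h**2
    return PREFACTOR * m_h * (m_f / M_W) ** 2 * velocity_sq**1.5 * n_c


def form_factor(x):
    """Assumed closed form of I(x), x = m_h^2 / m_q^2; complex for x > 4."""
    tau = x / 4.0
    if tau < 1e-4:                      # short series avoids round-off cancellation
        return complex(1.0 + 7.0 * tau / 30.0)
    if tau <= 1.0:
        f = math.asin(math.sqrt(tau)) ** 2
    else:
        beta = math.sqrt(1.0 - 1.0 / tau)
        f = -0.25 * (math.log((1.0 + beta) / (1.0 - beta)) - 1j * math.pi) ** 2
    return complex(1.5 * (tau + (tau - 1.0) * f) / tau**2)


def form_factor_integral(x, n=200):
    """Midpoint estimate of 3 * int dy dz (1 - 4yz)/(1 - x y z), valid for x < 4."""
    total = 0.0
    for i in range(n):
        y = (i + 0.5) / n
        for j in range(n):
            z = (1.0 - y) * (j + 0.5) / n          # z covers 0 .. 1 - y
            total += (1.0 - y) * (1.0 - 4.0 * y * z) / (1.0 - x * y * z)
    return 3.0 * total / n**2


def width_gg(m_h, amplitude):
    """Part (c): width (GeV) for h0 -> gg; amplitude is the sum over quarks of I."""
    return (PREFACTOR * m_h * (m_h / M_W) ** 2
            * ALPHA_S**2 / (9.0 * math.pi**2) * abs(amplitude) ** 2)


def gluon_pdf(x):
    """Toy gluon distribution quoted in part (d)."""
    return 8.0 * (1.0 - x) ** 7 / x


def sigma_pp_to_h(m_h, sqrt_s, n=20000):
    """Part (d): narrow-width estimate of sigma(pp -> h0 X) in nb, with I set to 1."""
    tau = m_h**2 / sqrt_s**2
    y_min = math.log(tau)
    dy = -y_min / n
    luminosity = 0.0                    # int_tau^1 dx/x f_g(x) f_g(tau/x), using y = ln x
    for i in range(n):
        x = math.exp(y_min + (i + 0.5) * dy)
        luminosity += gluon_pdf(x) * gluon_pdf(tau / x) * dy
    sigma = math.pi**2 / 8.0 * width_gg(m_h, 1.0) / m_h**3 * tau * luminosity
    return sigma * GEV2_TO_NB


if __name__ == "__main__":
    for m_h in (50.0, 100.0, 200.0, 400.0):
        i_top = form_factor(m_h**2 / M_T**2)
        print(f"m_h = {m_h:5.0f} GeV  Gamma(bb) = {1e3 * width_ffbar(m_h, M_B, 3):6.3f} MeV"
              f"  |I_top| = {abs(i_top):.3f}  Gamma(gg) = {1e6 * width_gg(m_h, i_top):8.2f} keV")
    for sqrt_s in (1000.0, 40000.0):
        print(f"sqrt(s) = {sqrt_s / 1000:4.0f} TeV  sigma = {sigma_pp_to_h(30.0, sqrt_s):.3e} nb")
```

The closed form and the double integral can be compared for $x < 4$, which is a useful check before the closed form is trusted above the quark threshold, where $I(x)$ acquires an imaginary part. The $W^+W^-$, $Z^0Z^0$, and $2\gamma$ widths are left out on purpose, since deriving them is the substance of parts (b) and (f).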




Epilogue 




Chapter 22 


Quantum Field Theory at the Frontier 


In this textbook we have surveyed the most important ideas of quantum field 
theory. Working from the basic concepts that come from fusing relativity, 
quantum mechanics, and fields, we have built an elaborate structure, which 
includes such remarkable elements as coupling constant renormalization and 
non-Abelian gauge fields. We have seen at many points that the strange and 
abstract elements of this structure actually intersect with observation and even 
produce explanations for unexpected aspects of the behavior of elementary 
particles. 

In the course of our study, we have arrived at a complete theory of the 
strong, weak, and electromagnetic interactions of elementary particles. Each 
element of this theory has been described as a quantum field theory, and 
these quantum field theories have turned out to have very similar structure 
as gauge theories coupled to fermions. At various points in our discussion, we 
have noted that these theories have passed stringent quantitative experimental 
tests. We have not had space to describe the wide variety of experiments 
that contribute to our faith in these theories, but today almost all particle 
physicists consider this $SU(3) \times SU(2) \times U(1)$ gauge theory as established. In
fact, most of these people refer to this theory condescendingly as ‘the standard 
model’. 

Though the best data to support the standard model have come from 
experiments of the past five years, the ideas behind it are much older. Most 
of the theoretical developments described in this book were concluded in the 
1970s, a generation removed from the current frontier of physics. But this 
does not mean that quantum field theory is irrelevant to that frontier, any
more than quantum mechanics and electrodynamics have lost their relevance
after many years of exploration. On the contrary, the theory of elementary
particles—like other areas of physics that depend on quantum fluctuations in 
continua—still holds deep mysteries, and quantum field theory remains the 
principal tool for exploration of these questions. 

In this final chapter, we will flash forward to the present day and discuss 
the relevance of quantum field theory to current questions in the physics 
of the fundamental interactions. We will present what are, in our view, the 
outstanding unsolved problems of elementary particle physics and describe 
how quantum field theory is being used to confront these problems. Many of
these applications involve aspects of quantum field theory that are beyond
the scope of this book. These include the use of quantum field theory in the 
regime beyond the reach of perturbation theory and the use of quantum field 
theory to explore the special properties of theories with higher spin and local 
symmetry. In these areas our discussion will be mainly qualitative, but we will 
give references that provide points of entry into each of these subjects. 

It should be obvious that our discussion in this final chapter will express 
our personal opinions and by no means represents the consensus of experts 
in quantum field theory. In addition, any collection of ‘current problems’ in 
a rapidly developing area of research should quickly become dated. In fact, 
we hope the readers of this book will quickly make this chapter obsolete by 
solving the problems that we highlight here. 

22.1 Strong Strong Interactions 

One paradoxical aspect of our discussion of the strong interactions is that all 
of our concrete results were obtained by assuming that these interactions are 
weak. At large momentum transfer, we argued, this assumption is actually 
valid due to asymptotic freedom. Still, it is uncomfortable that we have left 
the most obvious questions about strongly interacting particles—for example, 
their masses and low-energy interactions—in a mysterious regime excluded 
from our analysis. 

To work with QCD in the region where the strong interactions are strong, 
we need to answer three questions: First, how do we describe the forces that 
bind quarks together into hadrons? Second, what is an appropriate description 
of the quark-antiquark and three-quark systems bound by those forces? And 
finally, how do we compute scattering amplitudes and matrix elements of 
currents using these bound states? 

At this moment, there is no derivation of the complete force between 
quarks from the QCD Lagrangian. Explicit calculations can be done only 
in the limits of weak and strong coupling. In the weak-coupling limit, the 
result is a Coulomb potential with an asymptotically free coupling constant. 
The strong coupling limit, on the other hand, gives a linear potential which 
confines color in the way that we described, but did not derive, at the end 
of Section 17.1. The derivation of this result is quite unusual and brings in a 
new set of mathematical methods. 

So far in this book, we have not discussed a strong coupling approximation 
to a quantum field theory. There is a simple reason for this: In a quantum 
field theory in which the coupling $g^2$ is very large, the elementary particles or
their bound states typically acquire masses that grow with $g^2$. For $g^2 \to \infty$,
these masses become comparable to the cutoff $\Lambda$ and the field theory ceases
to have a local continuum description. 

Wilson proposed to solve this problem in a radical way, by replacing 
spacetime with a lattice of discretely spaced points. Such a lattice is easiest 
to visualize in Euclidean spacetime, and so we can use a functional integral
over fields on a lattice to approximate Euclidean Green’s functions. Such a 
theory can have a well-defined strong coupling limit. A theory of this type is 
very similar to a lattice model of a magnetic system. 

In fact, we can understand this construction of a quantum field theory 
by using the concepts of Chapter 13. A lattice theory with fluctuating spin 
variables at each lattice site is described in the large by a quantum field 
theory of scalar fields with the symmetry of the underlying spin variables. 
Typically, the strong-coupling limit of the quantum field theory corresponds 
to the high-temperature limit of the magnet, in which the correlation length 
is much smaller than the lattice spacing. Decreasing the coupling constant 
corresponds to decreasing the temperature. Eventually, the coupling constant 
comes close to a fixed point of the renormalization group, and one can use 
this fixed point to define a limit of the lattice functional integral in which the 
lattice spacing is taken to zero. 

To build a lattice model of the strong interactions, one needs to find a set 
of variables on the discrete lattice that correspond in the large to non-Abelian 
gauge fields. Wilson proposed that the fundamental variables for such a theory 
should be the line elements from one lattice vertex $v_1$ to a neighboring vertex
$v_2$,

$$ U(v_2, v_1) = P \exp\Big[ ig \int_{v_1}^{v_2} dx^\mu A_\mu^a t^a \Big]. \tag{22.1} $$

To construct the lattice gauge theory with gauge group $G$, one should integrate
over a finite group transformation $U$ for each link of the lattice. Taking
a product of these $U$ matrices around a closed path, one can construct
gauge-invariant observables, just as we did in Section 15.3. An appropriate
Lagrangian can also be constructed as a sum of gauge-invariant products of
the $U$ matrices about elementary closed loops of the lattice.*

In a spin system, the defining property of the high-temperature phase is 
the exponential decay of correlations 

$$ \langle s(0) \cdot s(x) \rangle \sim \exp[-|x|/\xi] \tag{22.2} $$

as $|x| \to \infty$. The analogous property of the gauge-invariant correlation function
of $U$ matrices around a closed path $P$ is

$$ \Big\langle \prod_P U \Big\rangle \sim \exp[-A/\xi^2], \tag{22.3} $$

where A is the area spanned by the path. This behavior is in fact seen explicitly 
in the expansion of Wilson’s lattice gauge theory for strong coupling. A pair 
of color sources that stand a distance R apart for a Euclidean time T are 
represented by a large rectangular loop of width R and length T. For such a 


*This construction was introduced by K. Wilson, Phys. Rev. D10, 2445 (1974).
The construction is described pedagogically in M. Creutz, Quarks, Gluons, and Lattices
(Cambridge University Press, Cambridge, 1983).
loop, we can compare the result (22.3) to the expression for the energy of an
excited state in Euclidean time,

$$ \langle \exp[-HT] \rangle \sim \exp[-RT/\xi^2]. \tag{22.4} $$

Then we see that static sources of gauge charge, in the strong-coupling limit, 
are attracted to one another by a potential energy 

$$ V(R) \sim R/\xi^2 \tag{22.5} $$

at sufficiently large R. Similarly, when one introduces dynamical quarks into a 
lattice gauge theory and studies their properties in the strong-coupling limit, 
configurations with large separation of color sources are suppressed in the 
Euclidean functional integral by factors of the form of (22.3). The strong-coupling
limit then predicts the permanent confinement of quarks into color-singlet
bound states.
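
The area law can be checked exactly in a toy model. The sketch below is not Wilson's $SU(3)$ theory in four dimensions: it assumes the much simpler gauge group $Z_2$ (link variables $\pm 1$) on a two-dimensional lattice of $2 \times 2$ plaquettes with open boundaries, and weights each configuration by $\exp[\beta \sum_p U_p]$, where $U_p$ is the product of the four links around plaquette $p$ and small $\beta$ plays the role of strong coupling. In this toy model the plaquettes fluctuate independently, so the Wilson loop enclosing $A$ plaquettes should average to $(\tanh\beta)^A$, which is the area law of (22.3) with $1/\xi^2 = -\log\tanh\beta$ in lattice units. The lattice size, the group, and the function names are all assumptions of this illustration.

```python
import itertools
import math


def wilson_loop_average(beta, width, height, lx=2, ly=2):
    """Exact <W> for a width x height loop in 2D Z2 gauge theory on lx x ly plaquettes."""
    # Number the links: horizontal (i,j)->(i+1,j) first, then vertical (i,j)->(i,j+1).
    horizontal, vertical, count = {}, {}, 0
    for j in range(ly + 1):
        for i in range(lx):
            horizontal[(i, j)] = count
            count += 1
    for j in range(ly):
        for i in range(lx + 1):
            vertical[(i, j)] = count
            count += 1

    def plaquette(u, i, j):
        # Product of the four links around the plaquette with lower-left corner (i, j).
        return (u[horizontal[(i, j)]] * u[vertical[(i + 1, j)]]
                * u[horizontal[(i, j + 1)]] * u[vertical[(i, j)]])

    def loop(u):
        # Product of links around the rectangle with corners (0, 0) and (width, height).
        product = 1
        for i in range(width):          # bottom and top edges
            product *= u[horizontal[(i, 0)]] * u[horizontal[(i, height)]]
        for j in range(height):         # left and right edges
            product *= u[vertical[(0, j)]] * u[vertical[(width, j)]]
        return product

    numerator = denominator = 0.0
    for u in itertools.product((1, -1), repeat=count):   # sum over all link configurations
        plaquette_sum = sum(plaquette(u, i, j) for i in range(lx) for j in range(ly))
        weight = math.exp(beta * plaquette_sum)
        numerator += weight * loop(u)
        denominator += weight
    return numerator / denominator


if __name__ == "__main__":
    beta = 0.3
    for width, height in ((1, 1), (2, 1), (2, 2)):
        exact = wilson_loop_average(beta, width, height)
        area_law = math.tanh(beta) ** (width * height)
        print(f"{width} x {height} loop: <W> = {exact:.6f}, tanh(beta)^A = {area_law:.6f}")
```

Because the sum runs over all $2^{12}$ link configurations, no Monte Carlo sampling is involved. In two dimensions the area law holds at any $\beta$, so this toy model shows the mechanism of the strong-coupling argument but not the interplay with asymptotic freedom discussed next.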

The argument we have just given applies equally well to gauge theories 
based on Abelian or non-Abelian symmetry groups. But non-Abelian gauge 
theories have the important additional property of asymptotic freedom. In this 
context, that implies that a theory with weak coupling at short distances flows 
to a theory with strong coupling at large distances. If we imagine integrating 
out short-distance degrees of freedom as we described in Section 12.1, and 
if there is no zero of the $\beta$ function or other barrier to the renormalization
group flow, we should eventually arrive at an effective theory for which the 
strong-coupling expansion is a good approximation. Thus, in the particular 
case of non-Abelian gauge theories, asymptotic freedom allows us to connect 
a short-distance picture based on free quarks and gluons to a large-distance 
picture based on color confinement. 

It would be wonderful if the strong-coupling picture that we have described
led to mathematical equations in continuum spacetime describing the
motion of permanently confined quarks and antiquarks. Many authors have 
tried to write such equations by imagining the area suppression of the Wilson 
loop correlation function (22.3) to result from a physical surface that spans the 
loop. For the large rectangular loop associated with color sources, this surface 
can be interpreted physically as the lines of color electric flux that run from 
one source to the other (as in Fig. 17.1), swept out through Euclidean time. 
At one moment of Euclidean time, this surface can be idealized as an abstract 
one-dimensional excitation, often called a string. Unfortunately, the quantum 
properties of an idealized string turn out to be very complicated, since each 
small element of the string must be considered as an independent quantum 
degree of freedom. The only systems of string equations that have actually 
been solved have bizarre features, including unwanted massless particles. Up 
to now, no one has succeeded in writing an equation for the quark-confining 
string that is useful for quantitative calculations of quark bound states.†

†For one approach to color confinement from a picture involving Wilson loops and
strings, see A. A. Migdal, Phys. Repts. 102, 199 (1983).

However, the lattice regularization of a non-Abelian gauge theory suggests
another approach to quantitative calculations in strong-interaction theory. By 
approximating QCD by a lattice gauge theory with a nonzero lattice spacing 
and a finite spacetime volume, we reduce the functional integral to a finite 
number of bounded integrations, that is, an integral over $SU(3)$ group matrices
for each of the finite number of links in the lattice. A lattice of size, for
example, $20^4$ allows the lattice spacing to be smaller than the size of a hadron
while the full size of the lattice is much larger than a hadronic radius. Then 
one can compute correlation functions by evaluating the integrals numerically, 
by the Monte Carlo method. Since the functional integral with a finite lattice 
spacing is related to the original functional integral with zero lattice spacing 
by integrating out short-distance degrees of freedom, the lattice approximation
can be systematically improved by computing the short-distance effects
perturbatively, using asymptotic freedom to justify a weak-coupling analysis.*

This numerical method has now become the principal theoretical tool for 
quantitative calculations in hadron physics. This method currently gives the 
masses of the low-lying mesons and baryons to accuracies of 10-20%; it also
allows the calculation of weak interaction matrix elements of hadrons at the 
25% level. As computers become more powerful, this numerical method can 
be pushed to higher accuracy. 

Eventually, it will be interesting to ask whether these nonperturbative 
numerical calculations are consistent with our precision knowledge of the perturbative
region of QCD. At the time of this writing, the first such comparison
has been made: We have listed in Table 17.1 a value of $\alpha_s$ from $\psi$ and $\Upsilon$ spectroscopy.
In this calculation, the experimentally determined masses of $c\bar{c}$ and
$b\bar{b}$ bound states are compared to values computed numerically with lattice regularization.
The comparison of these values gives the required bare coupling
constant of the lattice theory, which can be converted to a value of $\alpha_s(m_Z)$
in the convention of the table. The resulting estimate for $\alpha_s(m_Z)$ does agree
reasonably well with purely perturbative determinations.

What is the future of nonperturbative calculations in hadron physics? On 
the one hand, we expect to see further development of numerical lattice methods.
These methods have hardly begun to address problems of hadron-hadron
scattering and multiparticle matrix elements, and this seems an important direction
for the future. In addition, these methods should eventually supply an
engineering understanding of hadrons at the percent level or better. On the 
other hand, we hope also to see a quantitative continuum approach to hadron 
structure, in which dynamical quarks interact with some appropriate type of 
string degrees of freedom. 


*For an introduction to numerical lattice gauge theory, see From Actions to Answers,
T. DeGrand and D. Toussaint, eds. (World Scientific, Singapore, 1990).

22.2 Grand Unification and its Paradoxes

If we put aside our questions about the low-energy, nonperturbative behavior 
of QCD, the $SU(3) \times SU(2) \times U(1)$ gauge theory gives an apparently complete
description of elementary particle interactions at those energies that we have 
probed experimentally. But what happens beyond our current reach? Does 
this theory need modification, or could it continue to be valid at much higher 
energies? 

The $SU(3) \times SU(2) \times U(1)$ gauge theory contains three independent gauge
coupling constants, and the observed values of these couplings are larger for 
the larger components of the gauge group. This pattern can be explained by a 
bold hypothesis about the behavior of the gauge couplings at very high energy. 
If at some very large energy scale, these three couplings were equal, the values 
of the $SU(3)$ and $SU(2)$ couplings would increase at smaller momentum scales
due to their asymptotically free renormalization group equations, while the
value of the $U(1)$ coupling would decrease, resulting in the observed pattern of
couplings at low energies. An even bolder hypothesis would be that the three
gauge symmetries are subgroups of a single large symmetry group, which is
spontaneously broken at very high energy scales. The simplest choice for this
larger symmetry is $SU(5)$. In that theory, the coupling constants of
$SU(3) \times SU(2) \times U(1)$ have the following relation to the underlying $SU(5)$ coupling at
the scale of $SU(5)$ breaking:

$$ g_5 = g_3 = g = \sqrt{\tfrac{5}{3}}\, g'. \tag{22.6} $$



The idea that the $SU(3) \times SU(2) \times U(1)$ gauge group is embedded in a larger
simple group is known as grand unification; the particular choice of $SU(5)$ as
the unifying group is due to Georgi and Glashow.* The observed quarks and
leptons can be seen to fit neatly into an anomaly-free chiral representation of
$SU(5)$; this embedding leads to a natural explanation of the fractional charges
of quarks.†

Within this framework, we can extrapolate the values of the three coupling 
constants from the energy scale of $m_Z$ upward. The result of this extrapolation
is shown as the solid lines in Fig. 22.1. The coupling constants do come close 
together at very high energies, though they do not actually meet. The dashed 
lines in the figure show the evolution with a modified set of renormalization 
group equations, to be explained in Section 22.4; with this choice, the three 
couplings meet accurately within their current uncertainties. In any event, 
the evolution of coupling constants occurs on a logarithmic scale in energy, so 
grand unification cannot be achieved without assuming an enormous value, of
order $10^{16}$ GeV, for the symmetry-breaking scale.
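
The extrapolation can be reproduced at one-loop order with a few lines of Python. This is a rough sketch under stated assumptions, not the calculation behind Fig. 22.1. It assumes the one-loop solution $\alpha_i^{-1}(Q) = \alpha_i^{-1}(m_Z) - (b_i/2\pi)\log(Q/m_Z)$, the standard one-loop coefficients $b = (41/10, -19/6, -7)$ for the known particles and $b = (33/5, 1, -3)$ for the supersymmetric spectrum, the normalization $\alpha_1 = (5/3)\alpha'$, and the starting values $\alpha^{-1}(m_Z) = 128$, $\sin^2\theta_w = 0.23$, $\alpha_s(m_Z) = 0.12$. Applying the supersymmetric coefficients all the way down to $m_Z$ ignores the thresholds of the new particles, and the function names are hypothetical.

```python
import math

M_Z = 91.0  # GeV

# One-loop coefficients b_i for (U(1) with the 5/3 normalization, SU(2), SU(3)).
B_STANDARD_MODEL = (41.0 / 10.0, -19.0 / 6.0, -7.0)
B_SUPERSYMMETRIC = (33.0 / 5.0, 1.0, -3.0)   # assumed valid from m_Z upward


def inverse_couplings_at_mz(alpha_inv=128.0, sin2_w=0.23, alpha_s=0.12):
    """Return (1/alpha_1, 1/alpha_2, 1/alpha_3) at m_Z, with alpha_1 = (5/3) alpha'."""
    return (0.6 * (1.0 - sin2_w) * alpha_inv, sin2_w * alpha_inv, 1.0 / alpha_s)


def run_inverse_coupling(inv_at_mz, b, scale):
    """One-loop running: 1/alpha(Q) = 1/alpha(m_Z) - (b / 2 pi) ln(Q / m_Z)."""
    return inv_at_mz - b / (2.0 * math.pi) * math.log(scale / M_Z)


def crossing_scale(inv, b, i, j):
    """Scale Q in GeV at which couplings i and j become equal at one loop."""
    log_ratio = 2.0 * math.pi * (inv[i] - inv[j]) / (b[i] - b[j])
    return M_Z * math.exp(log_ratio)


if __name__ == "__main__":
    inv = inverse_couplings_at_mz()
    for label, b in (("known particles", B_STANDARD_MODEL),
                     ("supersymmetric", B_SUPERSYMMETRIC)):
        q12 = crossing_scale(inv, b, 0, 1)
        q23 = crossing_scale(inv, b, 1, 2)
        print(f"{label:16s} alpha_1 = alpha_2 at {q12:.2e} GeV, "
              f"alpha_2 = alpha_3 at {q23:.2e} GeV")
```

If the three couplings met at a single point, the two crossing scales would coincide. Comparing them for the two sets of coefficients gives a quick measure of how closely each spectrum unifies, and the logarithm in `crossing_scale` makes explicit why the unification scale is exponentially larger than $m_Z$.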


*H. Georgi and S. L. Glashow, Phys. Rev. Lett. 32, 438 (1974). The remarkable
hubris of this paper makes it required reading for every student.

†For a pedagogical introduction to grand unification, see Ross (1984).

Figure 22.1. Extrapolation in energy of the coupling constants of the
$SU(3) \times SU(2) \times U(1)$ gauge model, $g_3$, $g$, and $\sqrt{5/3}\, g'$. The solid lines are
plotted using the $\beta$ functions corresponding to the known set of elementary
particles; the dashed lines are plotted using the $\beta$ functions corresponding to
a supersymmetric multiplet of particles.

The idea of a grand unification at such enormous energies raises many 
difficult questions, but it also suggests a wonderful opportunity. There is another
enormous energy scale in quantum field theory, the scale at which the
gravitational attraction of elementary particles becomes comparable to their
strong, weak, and electromagnetic interactions. Conventionally, one defines
the Planck scale as the energy for which the gravitational interaction of particles
becomes of order 1:

$$ m_{\rm Planck} = (G_N/\hbar c)^{-1/2} \sim 10^{19}\ \text{GeV}. \tag{22.7} $$

However, already at energies of order $10^{18}$ GeV, the gravitational attraction
of particles is comparable to the gauge force due to the vector bosons of a 
grand unified theory. Though this scale is still slightly higher than the scale 
at which the standard model coupling constants meet, it is not unreasonable 
to hope that grand unification is somehow related to the unification of gravity 
with the forces of elementary particle physics. 
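
Both numbers in this paragraph can be checked with a short calculation. The sketch below evaluates $(\hbar c^5/G_N)^{1/2}$ in GeV, which is the Planck scale with the factors of $c$ restored, and then estimates where the dimensionless gravitational coupling $(E/m_{\rm Planck})^2$ equals a gauge coupling. The value $\alpha = 1/40$ for the gauge coupling near the unification scale is an assumption made only for illustration, as are the function names; the physical constants are the standard SI values.

```python
import math

HBAR = 1.054571817e-34          # J s
C = 2.99792458e8                # m / s
G_NEWTON = 6.67430e-11          # m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10


def planck_energy_gev():
    """Planck energy sqrt(hbar c^5 / G_N), converted from joules to GeV."""
    return math.sqrt(HBAR * C**5 / G_NEWTON) / JOULE_PER_GEV


def gravity_equals_gauge_scale(alpha_gauge=1.0 / 40.0):
    """Energy (GeV) at which (E / m_Planck)^2 equals alpha_gauge (illustrative value)."""
    return planck_energy_gev() * math.sqrt(alpha_gauge)


if __name__ == "__main__":
    print(f"Planck scale:            {planck_energy_gev():.3e} GeV")
    print(f"gravity ~ gauge force at {gravity_equals_gauge_scale():.3e} GeV")
```

Since the crossover energy scales only as the square root of the assumed gauge coupling, any reasonable choice for that coupling leaves the estimate at the same order of magnitude.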

On the other hand, the introduction of this large scale into physics leads 
to a number of conceptual problems. The first of these problems, which one 
meets immediately upon suggesting this extension of the standard model, is 
the Higgs boson mass. In our discussion at the end of Section 20.2, we came 
to a somewhat ambiguous conclusion about the nature of the Higgs boson. As 
a part of the gauge theory of weak interactions, we need some new sector that
will cause the spontaneous breaking of $SU(2) \times U(1)$. This might be supplied
by the vacuum expectation value of a scalar field, or by the more complicated 
dynamics of a new sector of particles. At this moment, we do not know which 
hypothesis is to be preferred. 

If $SU(2) \times U(1)$ is broken by the vacuum expectation value of an elementary
scalar field, that scalar field should be part of the grand unification. This
leads to a serious conceptual problem. In order to produce a vacuum expectation
value of the right size to give the observed $W$ and $Z$ boson masses, the
Higgs scalar field must obtain a negative mass term, of the size

$$ -\mu^2 \sim -(100\ \text{GeV})^2. \tag{22.8} $$

Unfortunately, the (mass)$^2$ of a scalar field receives additive renormalizations.
In a theory with cutoff scale $\Lambda$, $\mu^2$ can be much smaller than $\Lambda^2$ only if the
bare mass of the scalar field is of order $-\Lambda^2$, and this value is canceled down
to $-\mu^2$ by radiative corrections. If we envision that our theory of Nature
contains the very large scales of grand unification, we must take seriously the
idea that the appropriate value to take for $\Lambda$ in this discussion is $10^{16}$ GeV or
larger. This seems to require dramatic and even bizarre cancellations in the
renormalized value of $\mu^2$.

We met a situation of this type in the theory of phase transitions. At zero 
temperature, a ferromagnet typically has a spin expectation value of the order 
of the underlying atomic parameters. As the temperature is raised, or as some 
other variable in the system is changed, the magnetization decreases. Finally, 
by fine adjustment of the temperature, we can arrive at a situation where the 
system approaches a critical point. In the very near vicinity of this point, the 
expectation value of the spin field is much smaller than the value predicted 
from atomic parameters, and the system is described by an approximately 
massless continuum scalar field theory. 

In statistical mechanics, this picture of the light scalar field makes sense 
because there is an experimenter sensitively adjusting a dial. In the theory of 
weak interactions, there is no one obviously making a fine adjustment that 
gives the (mass)$^2$ of the Higgs boson a value 28 orders of magnitude or more
below its natural value. Thus, it is a mystery why the Higgs boson mass should 
be so small compared to the grand unification scale. Particle physicists refer 
to this question as the gauge hierarchy problem. 
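
The figure of 28 orders of magnitude is simple arithmetic on the two scales involved, as the following sketch shows. It assumes that the natural size of the Higgs (mass)$^2$ is $\Lambda^2$ with $\Lambda = 10^{16}$ GeV and that the required value is $(100\ \text{GeV})^2$; the function name and the alternative choice of a Planck-scale cutoff are illustrative.

```python
import math


def tuning_orders_of_magnitude(mu_gev=100.0, cutoff_gev=1.0e16):
    """Return log10(Lambda^2 / mu^2), the number of decades of cancellation required."""
    return math.log10(cutoff_gev**2 / mu_gev**2)


if __name__ == "__main__":
    print(f"cutoff at the unification scale: {tuning_orders_of_magnitude():.0f} orders")
    print(f"cutoff at the Planck scale:      {tuning_orders_of_magnitude(cutoff_gev=1.2e19):.0f} orders")
```

The count grows by two for every factor of ten in the cutoff, which is why the text says "28 orders of magnitude or more".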

How can one naturally arrange a Higgs field mass term to be so much 
smaller than the underlying mass scale of the fundamental interactions? One 
possible strategy would be to arrange for a symmetry of the fundamental 
Lagrangian that forbids the Higgs boson mass term and that is very weakly 
broken. This idea turns out to be very difficult to implement. To build a theory 
of this type, one would need to create a scalar field theory in which additive 
radiative corrections to the Higgs boson mass must cancel to any foreseeable 
order in perturbation theory. But the Higgs mass term is very simple in form,

$$ \Delta\mathcal{L} = \mu^2 |\phi|^2, \tag{22.9} $$

and it is hard to imagine any principle that would keep this term from being 
generated by radiative corrections. There is one proposal for a symmetry 
with this property, but it requires the introduction of a profound principle 
called supersymmetry that entails deep modifications of fundamental physics. 
In particular, it requires a large number of new elementary particles, some 
of which should have masses below 1000 GeV, within the reach of the next 
generation of accelerators. We will discuss this possibility further in Section 
22.4. 

In this discussion, the problem of the Higgs mass stemmed from the hypothesis
that the Higgs boson was an elementary particle. An alternative viewpoint,
already suggested at the end of Section 20.2, is that the Higgs boson is
a composite state bound by a new set of interactions. This idea also leads to
observable experimental consequences, since the mass scale of these new interactions
must be close to the weak interaction mass scale. In the simplest
theories of this type, the symmetry breaking of the Higgs sector is modeled
on the dynamical chiral symmetry breaking of the strong interactions, which
we discussed in Section 19.3. The new strong interactions required by the theory
lead to a spectrum of new particles with masses of about 1000 GeV.*
Thus, the two conflicting hypotheses on the nature of the sector that breaks
$SU(2) \times U(1)$ both lead to new phenomena observable at future accelerators,
and possibly even at present ones.

Just as these two different theories of the Higgs sector present completely
different answers to the question of why the weak-interaction symmetry
$SU(2) \times U(1)$ should be spontaneously broken, they also imply completely different
answers to the question of the origin of the quark and lepton masses. In
a model in which the Higgs field is elementary, the quark and lepton masses
are derived from the renormalizable couplings of fermions to the Higgs field.
These couplings would presumably be part of the grand unification and could
be predicted only by theories that made explicit reference to the grand unification
scale. In principle, the knowledge of these couplings could give us clues
as to the details of the grand unification. Even if the Higgs field is composite,
we cannot avoid the fact that the generation of masses for the quarks and
leptons requires the breaking of $SU(2) \times U(1)$. Thus, these mass terms must
arise from couplings of the quarks and leptons to the Higgs sector of interactions.
In this class of models, the interactions leading to the quark and lepton
masses must arise at energies close to the scale of the Higgs sector strong
interaction and may eventually be observable experimentally.

From either viewpoint, it is still mysterious why the spectrum of quarks
and leptons covers 5 orders of magnitude, from the electron at 0.5 MeV to
the top quark at 175 GeV. It is also not understood what gives rise to the
pattern of quark mixings encoded in the CKM matrix and the magnitude of
CP violation. Even with detailed confirmation of the standard model, these
questions seem today very far from solution.

*The properties of these models of the Higgs sector, known to specialists as
technicolor models, are described in R. Kaul, Rev. Mod. Phys. 55, 449 (1983)
and K. D. Lane, in The Building Blocks of Creation, S. Raby, ed. (World
Scientific, 1993).

The enormous mass scale of grand unification can also enter one more 
physical quantity, one that poses an even greater paradox than that of the 
Higgs boson mass. When we first quantized a field in Section 2.3, we discovered 
that the energy density of the vacuum in free scalar field theory received an 
infinite positive contribution from the zero-point energies of the various modes 
of oscillation. With a cutoff scale $\Lambda$, this zero-point energy is given
roughly by

$$\langle 0|\, H \,|0\rangle \sim \Lambda^4 . \tag{22.10}$$

At many other points in our discussion, we found similarly large contributions 
to the vacuum energy. The filling of the Dirac sea in the quantization of the 
free fermion theory led to a downward shift in the vacuum energy with a 
similar ultraviolet divergence. Spontaneous symmetry breaking gives a finite 
but still possibly large shift in the vacuum energy density, 

$$\Delta \langle 0|\, H \,|0\rangle \sim -c\,v^4 , \tag{22.11}$$

with dimensionless c, for a field vacuum expectation value v. The spontaneous
breaking of the weak interaction SU(2) × U(1) symmetry and of the strong
interaction chiral symmetry both would be expected to shift the vacuum energy
density in this way.

In elementary particle physics experiments, this shift of the vacuum energy
is unobservable. Experimentally measured particle masses, for example,
are energy differences between the vacuum and certain excited states of H, and
the absolute vacuum energy cancels out in the calculation of these differences.
However, there is a way that the absolute vacuum energy could potentially
be observed, through the coupling of the vacuum energy to gravity. According
to Einstein, the energy-momentum tensor of matter $\Theta^{\mu\nu}$ is the
source of the gravitational field. A vacuum energy density
$\langle 0| H |0\rangle = \lambda$ contributes to this source a term

$$\Theta^{\mu\nu} = N(\Theta^{\mu\nu}) + \lambda\, g^{\mu\nu} , \tag{22.12}$$

where the first term on the right is subtracted to have zero vacuum expecta¬ 
tion value. The vacuum energy term has the form of Einstein’s cosmological 
constant and thus potentially affects the expansion of the universe. 

In fact, measurements of the cosmological expansion exclude a large
cosmological constant. The current limit is

$$\lambda < 10^{-29}\ \mathrm{g/cm^3} \sim (10^{-11}\ \mathrm{GeV})^4 . \tag{22.13}$$

We have no understanding of why $\lambda$ is so much smaller than the vacuum
energy shifts generated in the known phase transitions of particle physics,
and so much smaller again than the underlying field zero-point energies. The
discrepancy in $\lambda$ between the experimental bound (22.13) and naive
intuition is 120 orders of magnitude! The solution to this problem may come
from one of many sources. It may be that the formalism of the quantum field
theory of gravity requires that the vacuum energy be subtracted from the
energy-momentum tensor that appears in Einstein's equations of gravity. It may
be that there is a new physical mechanism coming from particle physics or from
gravity itself that sets the total vacuum energy to zero. Or it may be that
the overall scale of energy-momentum is genuinely ambiguous and is set by a
cosmological boundary condition. At this moment, all of these possibilities are
just guesses. All we know for certain is that the unification of quantum field
theory and gravity cannot be straightforward, that there is some important
concept still missing from our current understanding.*
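
The size of this discrepancy follows from simple arithmetic, since both sides
of the comparison are energy densities that scale as (mass)^4. The short Python
sketch below is only an illustration: it assumes a Planck-scale cutoff rounded
to 10^19 GeV and the rounded scale 10^-11 GeV from the bound (22.13), and the
function and constant names are hypothetical.

```python
import math

def orders_of_magnitude(cutoff_gev, observed_scale_gev):
    """Return log10 of (cutoff / observed_scale)**4, the mismatch between
    two vacuum energy densities that both scale as (mass)^4."""
    return 4 * (math.log10(cutoff_gev) - math.log10(observed_scale_gev))

# Assumed, rounded inputs (not precise measurements):
M_PLANCK_GEV = 1e19             # Planck-scale cutoff
LAMBDA_BOUND_SCALE_GEV = 1e-11  # scale appearing in the bound (22.13)

mismatch = orders_of_magnitude(M_PLANCK_GEV, LAMBDA_BOUND_SCALE_GEV)
print(f"naive estimate exceeds the bound by about 10^{mismatch:.0f}")
```

With these rounded inputs the exponent is 4 × (19 + 11) = 120, the figure
quoted above.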


22.3 Exact Solutions in Quantum Field Theory 

From the idea of grand unification, with its great promise and mystery, we 
turn to the study of model quantum field theories that are so simple that they 
can be solved exactly. Throughout this book, we have stressed the intrinsic 
complexity of quantum field theory and the importance of using perturbation 
theory as a replacement for exact knowledge. But there are a variety of quantum
field theories for which exact solutions are known. In this section, we will
describe some of these and review the insights we have gained from them. 

In searching for exact solutions to quantum field theory models, there 
is no reason to restrict our attention to four-dimensional spacetime. In fact, 
we have seen examples of two-dimensional theories with similar complexity of 
renormalization and short-distance behavior. At the same time, these theories 
occupy a one-dimensional space, and their degrees of freedom can be visualized 
as links in a chain. This allows some powerful simplifications. 

In our discussion of the axial anomaly in two dimensions in Section 19.1,
we showed that the photon of two-dimensional massless QED becomes a massive
boson. More detailed examination of this theory shows that this boson
is a noninteracting particle. The theory is originally formulated in terms of
fermions, interacting through Coulomb forces. However, it is possible to
exactly rewrite the theory as a theory of a scalar field that creates and
destroys fermion-antifermion pairs. Heuristically, a particle and an
antiparticle moving down the light-cone in one-dimensional space do not
separate and therefore comprise one bosonic degree of freedom. In a wide class
of models, the bosonic theories rewritten in this way are free-field theories.
A remarkable model of this type is the Thirring model,

$$\mathcal{L} = \bar\psi\, i\!\not\!\partial\, \psi - \tfrac{1}{2}\, g\, (\bar\psi \gamma^\mu \psi)^2 , \tag{22.14}$$

in two dimensions. In this model, the replacement of the fermion field by a
boson field leads to a free field theory. Using this field theory, one can
compute correlation functions of fermion bilinears explicitly and show directly
that these operators have anomalous dimensions. In renormalization-group
language, the model contains a line of fixed points parametrized by the
coupling constant g.†

*The cosmological constant problem and a variety of unsuccessful solutions are
reviewed in S. Weinberg, Rev. Mod. Phys. 61, 1 (1989).

A more general class of two-dimensional models can be solved by visualizing
them in a Hamiltonian picture as a one-dimensional chain of coupled
field operators. The prototype of this method is a problem in the statistical
mechanics of magnets, the one-dimensional chain of coupled spins. Consider a
long chain of N discrete sites, with a spin-1/2 system at each site. The Pauli
sigma matrices $\boldsymbol{\sigma}_i$ act on the two-dimensional Hilbert space
at the site i. The Hamiltonian for the spin chain is then

$$H = \sum_i \boldsymbol{\sigma}_i \cdot \boldsymbol{\sigma}_{i+1} . \tag{22.15}$$

Since

$$\boldsymbol{\sigma}_i \cdot \boldsymbol{\sigma}_{i+1} = 2\,(\sigma_i^+ \sigma_{i+1}^- + \sigma_i^- \sigma_{i+1}^+) + \sigma_i^3 \sigma_{i+1}^3 , \tag{22.16}$$

this Hamiltonian conserves the number of up spins. The state with all spins
down is an eigenstate of the Hamiltonian, and the states with one spin up in a
state of definite momentum are also eigenstates. In 1931, Bethe analyzed the
problem of two spins up and computed their S-matrix. He then discovered an
amazing fact, that by multiplying the S-matrices for the two-spin problem, he
could find the exact eigenstates of the Hamiltonian for any number of spins up.
By considering N/2 spins up, he found the ground state of the system. This
technique, now known as Bethe's ansatz, has been used to solve a wide variety
of one-dimensional problems in condensed matter physics and quantum field
theory. For example, this technique has been used by Andrei and Lowenstein
to solve the Gross-Neveu model presented in Problem 11.3 and to demonstrate
that the spectrum of this model has the properties expected from asymptotic
freedom.‡
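
The statements about (22.16) can be checked directly on a small chain. The
following pure-Python sketch is an illustration under stated assumptions: it
takes $\sigma^\pm = (\sigma^1 \pm i\sigma^2)/2$, uses an open chain of four
sites with unit coupling (the boundary conditions and normalization are
choices made here, not taken from the text), and all helper names are
hypothetical.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker (tensor) product of two nested-list matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b, cb=1):
    """Return a + cb * b, entry by entry."""
    return [[x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def is_zero(a, tol=1e-12):
    """True if every entry of the matrix vanishes within tol."""
    return all(abs(x) < tol for row in a for x in row)

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
SP = [[0, 1], [0, 0]]   # sigma^+ = (sigma^1 + i sigma^2) / 2
SM = [[0, 0], [1, 0]]   # sigma^- = (sigma^1 - i sigma^2) / 2

# Two-site check of the identity (22.16).
lhs = add(add(kron(S1, S1), kron(S2, S2)), kron(S3, S3))
flip = add(kron(SP, SM), kron(SM, SP))
rhs = add(kron(S3, S3), flip, cb=2)
identity_holds = is_zero(add(lhs, rhs, cb=-1))

def site_op(op, site, n):
    """Embed a single-site operator at position `site` of an n-site chain."""
    result = [[1]]
    for k in range(n):
        result = kron(result, op if k == site else I2)
    return result

def heisenberg(n):
    """H = sum_i sigma_i . sigma_{i+1} on an open chain (assumed boundary)."""
    dim = 2 ** n
    h = [[0] * dim for _ in range(dim)]
    for i in range(n - 1):
        for s in (S1, S2, S3):
            h = add(h, matmul(site_op(s, i, n), site_op(s, i + 1, n)))
    return h

N = 4
H = heisenberg(N)

# Number-of-up-spins operator: sum over sites of (1 + sigma^3) / 2.
up_count = [[0] * 2 ** N for _ in range(2 ** N)]
for i in range(N):
    up_count = add(up_count, add(site_op(I2, i, N), site_op(S3, i, N)), cb=0.5)

commutator = add(matmul(H, up_count), matmul(up_count, H), cb=-1)
conserves_up_spins = is_zero(commutator)

# The all-spins-down state is the last basis vector; it is an eigenstate
# if the last column of H has no off-diagonal entries.
all_down_is_eigenstate = all(abs(H[r][-1]) < 1e-12 for r in range(2 ** N - 1))

print(identity_holds, conserves_up_spins, all_down_is_eigenstate)
```

Working with explicit matrices keeps the check independent of the identity
being tested; for long chains one would instead act on basis states directly,
since the matrix dimension grows as 2^N.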

Even where it is not possible to solve a model for all values of its
parameters, it is sometimes possible to find exact information about
two-dimensional models at points where they contain massless fields. It is well
known that a variety of classical two-dimensional partial differential
equations can be solved by exploiting techniques of complex variables. For
example, the two-dimensional Laplace equation $\nabla^2 \phi = 0$ is invariant
to conformal mappings $z \to w(z)$, where $z = x + iy$. Two-dimensional quantum
field theories with massless particles often have this conformal symmetry at
the classical level, though generically it is anomalous. In special systems,
however, these anomalies vanish and the quantum theory is invariant to
conformal mapping. These theories typically contain operators with anomalous
dimensions, indicating that each such theory is a new, nontrivial fixed point
of the renormalization group. The conformal symmetry of the theory can be used
to compute these anomalous dimensions.

†For an introduction to these models, see S. Coleman, Phys. Rev. D11, 2088
(1975), Ann. Phys. 101, 239 (1976).

‡For an introduction to Bethe's ansatz and its generalizations, see N. Andrei, K.
Furuya, and J. H. Lowenstein, Rev. Mod. Phys. 55, 331 (1983), L. D. Faddeev, in
Recent Advances in Field Theory and Statistical Mechanics, J. B. Zuber and R. Stora,
eds. (North-Holland, Amsterdam, 1984), and R. J. Baxter, Exactly Solved Models in
Statistical Mechanics (Academic Press, London, 1982).
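
The conformal invariance of the classical Laplace equation can be made explicit
in complex coordinates. The lines below sketch the standard argument, assuming
that $w(z)$ is analytic with $dw/dz \neq 0$; the derivative symbols
$\partial_z$ and $\partial_{\bar z}$ are introduced only for this illustration.

```latex
% Complex coordinates: z = x + i y, \bar z = x - i y
\partial_z = \tfrac{1}{2}(\partial_x - i\,\partial_y), \qquad
\partial_{\bar z} = \tfrac{1}{2}(\partial_x + i\,\partial_y), \qquad
\nabla^2 = \partial_x^2 + \partial_y^2 = 4\,\partial_z \partial_{\bar z} .

% Under an analytic map z \to w(z), the chain rule gives
\partial_z = \frac{dw}{dz}\,\partial_w , \qquad
\partial_{\bar z} = \overline{\left(\frac{dw}{dz}\right)}\,\partial_{\bar w} .

% Since \overline{dw/dz} depends only on \bar z, it commutes with \partial_z:
\partial_z \partial_{\bar z}\,\phi
  = \left|\frac{dw}{dz}\right|^2 \partial_w \partial_{\bar w}\,\phi
\quad\Longrightarrow\quad
\nabla_z^2\,\phi = \left|\frac{dw}{dz}\right|^2 \nabla_w^2\,\phi .
```

A function that solves the Laplace equation in the variable $w$ therefore
solves it in the variable $z$ as well, wherever the map is nonsingular.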

As an example of this class of theories, consider the two-dimensional
nonlinear sigma model in which the basic field is not a unit vector, as we
discussed in Section 13.3, but rather a unitary matrix of a Lie group G. The
Lagrangian of this theory is

$$\mathcal{L} = \frac{1}{4g^2} \int d^2x\; \mathrm{tr}\,[\partial_\mu U^\dagger \partial^\mu U] . \tag{22.17}$$

Like the theory of Section 13.3, this model is asymptotically free. However, 
Witten has shown that, by adding to this Lagrangian a particular perturbation 
of a rather complicated form first written by Wess and Zumino, one can find a 
fixed point of the renormalization group with manifest G x G global symmetry. 
This theory is conformally invariant, and all operator correlation functions can 
be computed using the conformal symmetry.* 

One result of the nonperturbative exploration of quantum field theory 
was the realization that field theories can contain particle states that are not 
simply related to the quanta of the original fields. In the weak-coupling limit 
of a quantum field theory, such new states can appear as new solutions of the 
classical field equations. Consider, for example, $\phi^4$ theory in two
dimensions in the broken-symmetry phase. The equation of motion is

$$\frac{\partial^2 \phi}{\partial t^2} - \frac{\partial^2 \phi}{\partial x^2} - \mu^2 \phi + \lambda \phi^3 = 0 . \tag{22.18}$$

Treating this equation as a classical partial differential equation, we can find 
the time-independent solution 

$$\phi(x) = \frac{\mu}{\sqrt{\lambda}} \tanh\frac{\mu x}{\sqrt{2}} . \tag{22.19}$$

This is a field configuration that begins in one well of the potential at
$x = -\infty$ and crosses over to the other well as $x \to +\infty$. This
solution has an energy of order $\mu^3/\lambda$, larger by a factor of order
$\mu^2/\lambda$ (the inverse of the dimensionless coupling) than the mass of a
$\phi$ quantum. Since the original equation (22.18) was Lorentz-covariant, the
boosts of this solution must also be solutions to the classical partial
differential equation. It is natural to suggest that, in the $\phi^4$ quantum
field theory, these solutions form a new set of massive particles. Such
solutions, and the particles corresponding to them, are often called solitons,
borrowing a more specialized term from the literature on two-dimensional
partial differential equations.†

*For an introduction to conformally invariant two-dimensional quantum field
theories, see P. Ginsparg, in Fields, Strings, Critical Phenomena, E. Brezin and J.
Zinn-Justin, eds. (North-Holland, Amsterdam, 1989).
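
The properties of the solution (22.19) can be checked numerically. The sketch
below is illustrative only: it assumes that (22.18) follows from the potential
$V(\phi) = -\tfrac{1}{2}\mu^2\phi^2 + \tfrac{1}{4}\lambda\phi^4$, measures the
energy relative to the vacuum, and uses hypothetical function names and sample
parameter values.

```python
import math

def kink(x, mu, lam):
    """Static solution (22.19): phi(x) = (mu/sqrt(lam)) tanh(mu x / sqrt(2))."""
    return mu / math.sqrt(lam) * math.tanh(mu * x / math.sqrt(2))

def residual(x, mu, lam, h=1e-3):
    """Left-hand side of (22.18) for a time-independent field, with the
    second x-derivative taken by a central finite difference."""
    d2 = (kink(x + h, mu, lam) - 2 * kink(x, mu, lam) + kink(x - h, mu, lam)) / h**2
    phi = kink(x, mu, lam)
    return -d2 - mu**2 * phi + lam * phi**3

def kink_energy(mu, lam, half_width=20.0, n=50_000):
    """Energy above the vacuum, by the trapezoid rule:
    integral of (1/2) phi'^2 + (lam/4) (phi^2 - mu^2/lam)^2."""
    dx = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * dx
        phi = kink(x, mu, lam)
        # Analytic derivative of (22.19): phi' = mu^2/sqrt(2 lam) sech^2(mu x/sqrt 2)
        dphi = mu**2 / math.sqrt(2 * lam) / math.cosh(mu * x / math.sqrt(2))**2
        density = 0.5 * dphi**2 + 0.25 * lam * (phi**2 - mu**2 / lam)**2
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * density * dx
    return total

mu, lam = 1.0, 0.1                      # sample values, chosen for illustration
numeric = kink_energy(mu, lam)
closed_form = 2 * math.sqrt(2) / 3 * mu**3 / lam
print(numeric, closed_form, residual(0.3, mu, lam))
```

The closed form $(2\sqrt{2}/3)\,\mu^3/\lambda$ used for comparison follows from
integrating the energy density analytically under the same assumptions; it
grows as the coupling is made weaker, as expected for a classical solution.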

Many examples are now known of particles that are associated in this
way with classical solutions of a quantum field theory. In theories with
spontaneously broken symmetry, the appearance of such particles is often
related to the topology of the set of vacuum states; the $\phi^4$ theory above
gives a simple example of this relation. These examples are not limited to two
dimensions but can also occur in theories that are potentially realistic. Such
solutions can have magical properties. One interesting example is found in the
SU(2) gauge theory with a Higgs scalar field in the vector representation, the
Georgi-Glashow model considered in Section 20.1. 't Hooft and Polyakov showed
that this theory has a classical solution in which the Higgs field $\phi^a$ has
the form

$$\phi^a(\mathbf{x}) = f(|\mathbf{x}|)\, x^a . \tag{22.20}$$

They showed that, when the gauge theory is interpreted as a unified model of 
weak and electromagnetic interactions, this solution is a magnetic monopole! 
In addition, particles that arise as heavy classical states in the weak coupling 
limit can have a more intricate relation to the dynamics of the theory when the 
coupling is increased. For example, in theories of the type of two-dimensional 
QED or the Thirring model in which fermions can be replaced by bosons, a 
weak-coupling limit is obtained by adding to the theory a large fermion mass. 
Then the original fermions are recovered from the bosonic representation of 
the theory as classical solutions very similar to that given in (22.19). 

In some theories, one can find classical solutions of the Euclidean field 
equations. These solutions, called instantons, are localized in Euclidean time 
as well as in space. Thus, they are interpreted as quantum processes that 
modify the effective Hamiltonian of a quantum field theory. The most famous 
example of an instanton is found in four-dimensional non-Abelian gauge
theories. It was shown by 't Hooft that this field configuration leads to a
quantum process that violates the conservation of the U(1) axial current in
QCD. We
have explained in Section 19.3 that this violation of current conservation is 
exactly what is needed to explain the spectrum of light mesons in QCD. 

There is probably much more to be learned, especially about the
strong-coupling behavior of gauge theories, by deeper analysis of the classical
solutions to the field equations, and of the interrelations of the many exactly
or partially solvable two-dimensional field theories.


†For an introduction to the use of solutions of the classical field equations in the
analysis of problems in field theory, see S. Coleman (1985), Chaps. 6 and 7, and R. 
Rajaraman, Solitons and Instantons (North-Holland, Amsterdam, 1982). 
22.4 Supersymmetry

Among the properties that a quantum field theory might possess to make it
more beautiful or more mathematically tractable, there is one higher symmetry
with particularly far-reaching implications. This is a symmetry that
relates fermions and bosons, known (without hyperbole) as supersymmetry.
In this section, we will introduce some of the purely mathematical
consequences of supersymmetry, and then discuss the question of whether the
true field equations of Nature could be supersymmetric.

A generator of supersymmetry is an operator that commutes with the
Hamiltonian and converts bosonic into fermionic states. Such an operator must
carry half-integer spin, in the simplest case spin 1/2. Let $Q_a$, with
a = 1, 2, be the left-handed spinor components of this operator. Their
Hermitian conjugates, $Q_a^\dagger$, form a right-handed spinor. The
anticommutator $\{Q_a, Q_b^\dagger\}$ is a 2 × 2 matrix with positive diagonal
elements; thus it cannot vanish. This matrix commutes with H but transforms
nontrivially under Lorentz transformations. A Lorentz-covariant expression for
this anticommutator is

$$\{Q_a, Q_b^\dagger\} = 2\,\sigma^\mu_{ab}\, P_\mu , \tag{22.21}$$

where $P_\mu$ is a conserved vector quantity. Such quantities are severely
restricted; a theorem of Coleman and Mandula states that, if a quantum field
theory in more than two dimensions has a second conserved vector quantity
in addition to the energy-momentum 4-vector, the S-matrix equals 1 and no
scattering is allowed. Thus the only possible choice for $P_\mu$ in Eq. (22.21)
is the total energy-momentum. The Coleman-Mandula theorem also rules out any
higher-spin conservation laws. This eliminates the possibility that a
supersymmetry generator could have spin 3/2 or higher. The most general
possibility is a collection of spin-1/2 operators with the anticommutation
relations

$$\{Q_a^i, Q_b^{j\dagger}\} = 2\,\delta^{ij}\, \sigma^\mu_{ab}\, P_\mu , \tag{22.22}$$

with i, j = 1, ..., N. In the following discussion, we will mainly consider only
the simplest case, N = 1.†

The algebra (22.22) of conserved quantities has profound consequences for
the theory. Since the right-hand side of (22.22) is the total energy-momentum, 
it involves every field in the theory. To reproduce this algebra, the left-hand 
side must also involve every field. The representations of this algebra pair 
every bosonic state with a fermionic state at the same energy, and vice versa. 
If supersymmetry is an exact symmetry of the quantum field theory, it must 
act on every sector of the theory. In a realistic model, even the gravitational 
field must have a fermionic partner. This means that Einstein’s equations of 
gravity must be generalized to a new set of geometrical equations that involve 
a fermionic (spin-3/2) field. 


†An excellent introduction to the formalism of supersymmetry is J. Wess and J.
Bagger, Supersymmetry and Supergravity (Princeton University Press, 1983).
The first consequences of making a quantum field theory supersymmetric
are easy to understand. For every (complex) scalar field, one must introduce
a chiral fermion field. The self-interactions of the bosonic fields are related
to the interactions of these fields with the fermions; for example, a possible
interaction Lagrangian with coupling constant $\lambda$ is

AC = -\ 2 \<t>\ 2 - ^Xip T <t 2 ip. (22.23) 

We have written a more general supersymmetric Lagrangian in Problem 3.5. 
Similarly, for every gauge field, one must introduce a chiral fermion in the 
adjoint representation of the gauge group. This fermion, called the gaugino,
mediates interactions of the scalar fields with their fermionic partners whose 
strength is given by the gauge coupling g. 

The special relation between the bosonic and fermionic interactions leads
to great simplifications in the renormalization of supersymmetric theories.
Some of these simplifications can be anticipated. Since supersymmetry
requires that each scalar particle have a fermionic partner of the same mass,
these particles must have the same mass renormalization. But we have seen
that the fermion mass is multiplicatively renormalized and thus is only
logarithmically divergent, while a scalar mass term is additively renormalized
and thus can be quadratically divergent. Supersymmetry must imply that
the quadratic divergences of scalar mass terms automatically vanish. In fact,
these cancellations occur in every order of perturbation theory, with loop
diagrams involving bosons canceling against diagrams with virtual fermions. To
see another simplification required by supersymmetry, take the vacuum
expectation value of the anticommutation relation (22.21). The vacuum state
has zero momentum: $P^i |0\rangle = 0$. If the vacuum state is supersymmetric,
$Q_a |0\rangle = Q_a^\dagger |0\rangle = 0$. Then Eq. (22.21) implies

$$\langle 0|\, H \,|0\rangle = 0 . \tag{22.24}$$

We have noted already that bosonic fields give positive contributions to the
vacuum energy through their zero-point energy, and fermionic fields give
negative contributions. We now see that, in a supersymmetric model, these
contributions cancel exactly, not only at the leading order but to all orders
in perturbation theory.
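
The step from the algebra (22.21) to the vanishing vacuum energy can be spelled
out in a few lines. The derivation below is a sketch that assumes the
convention $\sigma^\mu = (1, \boldsymbol{\sigma})$, so that
$\mathrm{tr}\,\sigma^\mu = 2\,\delta^{\mu 0}$, and identifies $P^0$ with H.

```latex
% Set b = a in (22.21) and sum over a = 1, 2:
\sum_{a=1}^{2} \{ Q_a , Q_a^\dagger \}
   = 2\,\mathrm{tr}(\sigma^\mu)\,P_\mu = 4 P^0 = 4 H .

% Take the expectation value in an arbitrary state |\psi\rangle:
\langle\psi|\, H \,|\psi\rangle
   = \frac{1}{4} \sum_{a=1}^{2}
     \Bigl( \bigl\| Q_a^\dagger |\psi\rangle \bigr\|^2
          + \bigl\| Q_a |\psi\rangle \bigr\|^2 \Bigr) \;\ge\; 0 .

% Equality holds exactly when
Q_a |\psi\rangle = Q_a^\dagger |\psi\rangle = 0 \quad (a = 1, 2).
```

The energy is thus non-negative in any state, and it vanishes precisely for a
state annihilated by all of the supersymmetry generators, which is the
condition assumed for the vacuum above.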

Deeper examination of supersymmetric theories leads to additional, and
quite unexpected, cancellations in renormalization theory. For example, one
can show that the coupling constants in scalar-fermion self-interactions, such
as $\lambda$ in (22.23), are renormalized only through field strength
renormalizations. Thus the relative size of two different scalar interactions
remains unchanged. If a particular type of renormalizable interaction is
omitted, it cannot be generated by renormalization, in contrast to the case in
ordinary field theory. The simplest supersymmetry does not constrain the
renormalization of gauge couplings, but higher supersymmetries can have a
profound effect: In N = 2 supersymmetric models, the $\beta$ function vanishes
if the leading-order term is arranged to be zero. In N = 4 supersymmetric
models, this cancellation is automatic and $\beta(g) = 0$ exactly. These models
give examples of four-dimensional quantum field theories with no ultraviolet
divergences.*

Supersymmetry thus endows a quantum field theory with remarkable, 
even magical properties. But is it possible that the true equations of Nature 
could possess such a high degree of symmetry? Since we are certain that 
there is no charged boson with the same mass as the electron, we know that 
supersymmetry cannot be an exact symmetry of Nature. But it is tempting 
to guess that it might be a spontaneously broken symmetry of the underlying 
equations. 

In fact, this conjecture has fruitful consequences for the grand unified
theories that we discussed in Section 22.2. The problem of the Higgs boson mass
that we highlighted in that section has an elegant solution in supersymmetry
models. In a supersymmetric version of the standard model, the Higgs field
is one of a large number of scalar fields with various SU(3) × SU(2) × U(1)
quantum numbers. For all of these scalar fields, the mass terms get only a
logarithmic multiplicative renormalization. If supersymmetry were broken in
such a way as to give mass differences of a few hundred GeV between the
observed fermionic quarks and leptons and their scalar partners, one would also
find a Higgs boson (mass)² of the correct size. There are good reasons, which
follow from more detailed properties of the theory, why it is the Higgs field,
rather than some other scalar field, that obtains a vacuum expectation value.†

If this set of ideas is correct, the scalar partners of quarks and leptons
would be light enough to be discovered experimentally in the near future. In
that case, these scalar particles and the fermionic partners of gauge bosons
would affect the renormalization of coupling constants between present
energies and the grand unification scale. This might potentially disturb the
prospects for grand unification, but, instead, it improves them: the dashed
lines of Fig. 22.1, with a more impressive meeting of the three coupling
constants, were generated by replacing the conventional $\beta$ functions with
ones including the supersymmetric partners.

The last of the problems discussed in Section 22.2 is also ameliorated by 
the introduction of supersymmetry. In a grand unified theory with broken 
supersymmetry, those momentum scales that are much larger than the mass 
differences of supersymmetry partners give no contribution to the vacuum 
energy. Thus the natural size of the cosmological constant in these theories 
is $\lambda \sim (100\ \mathrm{GeV})^4$. This reduces the cosmological constant problem to a
discrepancy of 50 orders of magnitude—but this is not nearly enough. 
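
As a rough check of this estimate, the sketch below repeats the (mass)^4
comparison with a 100 GeV scale in place of the unification scale. It assumes
the rounded scale 10^-11 GeV from the bound (22.13); the constant names are
hypothetical and the inputs are order-of-magnitude values only.

```python
import math

# Assumed, rounded inputs:
SUSY_SCALE_GEV = 100.0    # natural vacuum-energy scale with broken supersymmetry
BOUND_SCALE_GEV = 1e-11   # scale appearing in the observational bound (22.13)

# Vacuum energy densities scale as (mass)^4, so the exponent of the
# mismatch is 4 * log10 of the ratio of the two mass scales.
excess_orders = 4 * math.log10(SUSY_SCALE_GEV / BOUND_SCALE_GEV)
print(f"remaining discrepancy: about 10^{excess_orders:.0f}")
```

With these rounded inputs the exponent comes out close to 50, far below the
120 of the naive estimate but still very large.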


*Supersymmetric models with vanishing $\beta$ function are reviewed by P. West, in
Shelter Island II, R. Jackiw, N. N. Khuri, S. Weinberg, and E. Witten, eds. (MIT
Press, Cambridge, 1985).

†Supersymmetric models of quarks and leptons, and their observable
consequences, are reviewed in H. P. Nilles, Phys. Repts. 110, 1 (1984), and in
H. E. Haber and G. L. Kane, Phys. Repts. 117, 75 (1985).
It is an exciting prospect that supersymmetric partners of the particles of
the standard model might soon be seen in experiments. What we anticipate, in 
any event, is that the experiments of the next generation will make a definite 
choice between this hypothesis for the nature of the Higgs sector and the other 
possibilities discussed in Section 22.2. Either way, we will have advanced our 
knowledge one step toward the truly fundamental equations. 

22.5 Toward an Ultimate Theory of Nature 

What are these fundamental equations? Do they involve quantum field theory, 
or some very different organizing principle? Any answer to this question must 
be completely speculative. Nevertheless, there are some principles, and an 
example, that can guide this search. 

For all the attention we have given in this book to the basic interactions 
of particle physics, we have given very little attention to gravity. In part, 
this is because the quantum theory of gravity has no known observational 
consequences. But it is also true that the quantum theory of gravity is still 
ill-formed and uncertain. If gravity is treated as a weak-coupling field
theory with Feynman diagrams, one quickly finds that the divergences of these
diagrams render the theory nonrenormalizable. This is no surprise, because
gravity is a theory in which the coupling constant has inverse mass
dimensions, with the mass scale $m_{\rm Planck}$ given by (22.7). In our general
philosophy of renormalization, all of the complexity of this theory should be
concentrated at the scale $m_{\rm Planck}$.

At the scale where quantum fluctuations of the gravitational field are
important, we must expect profound changes in physics. If these changes occur
within the context of quantum field theory, they will at the least entail
fluctuating spacetime geometry and topology. But it seems equally probable that
quantum field theory will actually break down at this scale, with continuous
spacetime replaced by a new discrete or nonlocal geometry.

One particular model for the behavior of spacetime at very small distances
is string theory, the dynamics of abstract one-dimensional extended
objects. In Section 22.1, we mentioned that such objects seemed to occur
naturally in attempts to describe quark confinement in QCD, but that the
detailed properties of these objects made them unsuitable for strong
interaction phenomenology. Among the disappointing properties of these systems
were the appearance of massless spin-2 states of the string, and a constraint
that the dimension of spacetime must be increased unless the spectrum of the
theory contained many massless spin-1 states. In 1974, Scherk and Schwarz
made the remarkable suggestion that string theory was a correct mathematical
description of a different problem, the unification of elementary particle
interactions with gravity. They interpreted the spin-2 quantum as the graviton
and the spin-1 quanta as gauge bosons of a gauge theory.† A decade later,
Green and Schwarz put this conjecture on a firmer footing by showing that
a particular string theory could be interpreted as a grand unified theory in
ten spacetime dimensions, with all gravitational and gauge Ward identities
automatically satisfied and all anomalies automatically canceling. Since that
time, many other solutions to the constraint equations of string theory have
been found, some of which correspond to unified models of gauge interactions
and gravity in four dimensions. These models can naturally incorporate
supersymmetry and, under that condition, give ultraviolet-finite results for
all scattering amplitudes, including those of gravitons.*

†J. Scherk and J. H. Schwarz, Nucl. Phys. B81, 118 (1974).

String theories solve the ultraviolet divergence problems of quantum field
theory by rejecting the idea that elementary particles are pointlike objects
with contact interactions. Rather, in string theory, quarks, leptons, gauge
bosons, and gravitons are extended loops of string excitation which thus
interact nonlocally. Since particles cannot be definitely localized, spacetime
itself takes on a nonlocal character. In some sense, distances much less than
the Planck length $1/m_{\rm Planck}$ do not exist in the string description of
gravity. As yet, it is not clear how to understand intuitively the sort of
geometry that string theory requires. This mathematical problem is now being
actively investigated.

If indeed the truly fundamental geometry of Nature is nonlocal, discrete,
or discontinuous in some other way, then the grand program for the
fundamental interactions that we have set forth in this book must be altered
in an essential way. The most elementary equations of Nature will not be
gauge-invariant quantum field theories, but instead theories built from very
different elements. Even the principles of model construction will be
different from those based on gauge and Lorentz invariance that we have
discussed here.

On the other hand, quantum field theory will still play an essential role in 
the interpretation of this structure. All of the processes we now observe, even 
the elementary particle processes at the highest energies currently accessible, 
occur over distances 15 orders of magnitude larger than the sizes of the strings 
or other fluctuating structures that appear in the underlying equations. The 
relation of experimental observations to these fundamental structures is thus 
very similar to the relation of macroscopic observations to the underlying 
atomic structure of matter. In the study of matter, we use a classical,
Newtonian description of atoms to bridge this gap and to relate the properties
of gases, liquids, and solids to underlying atomic properties. We might say that
the quantum theory of atoms gives rise to a set of effective Newtonian
equations that is extremely powerful in the macroscopic domain. Especially in the
theory of gases, this Newtonian description was also used as a tool to realize 
the existence of atoms and to derive their properties. 


*A technical introduction to string theory and its use in building unified models 
has been given by M. B. Green, J. H. Schwarz, and E. Witten, Superstring Theory, 2 
vols. (Cambridge University Press, 1987). 
Similarly, whatever the nature of Planck-scale physics, it leads to some
effective continuum quantum field theory. This quantum field theory might
well be an accurate approximation to the underlying physics already at
distances of 100 Planck lengths, corresponding to momenta of $10^{17}$ GeV. From
here to the scale of weak interactions, and from there up to the wavelength
of light, and from there to the size of the universe, quantum field theory can
be treated as the basic framework for the equations of physics. By recognizing
the symmetries of the particular set of field equations that Nature has
provided us, we can learn to compute all of the details of physical processes
over this whole enormous domain. And, by contemplating the origin of these
symmetries, perhaps we will also be able to see through to the next level and
unlock the true structure of spacetime.

This Appendix collects together some of the formulae that are most commonly 
used in Feynman diagram calculations. 

A.1 Feynman Rules 

In all theories it is understood that momentum is conserved at each vertex, and 
that undetermined loop momenta are integrated over: $\int d^4p/(2\pi)^4$. Fermion 
(including ghost) loops receive an additional factor of $(-1)$, as explained on 
page 120. Finally, each diagram can potentially have a symmetry factor, as 
explained on page 93. 

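$\phi^4$ theory: $\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4$ 

Scalar propagator: $\dfrac{i}{p^2 - m^2 + i\epsilon}$ (A.1) 

$\phi^4$ vertex: $-i\lambda$ (A.2) 

External scalar: $1$ (A.3) 

(Counterterm vertices for loop calculations are given on page 325.) 

Quantum Electrodynamics: $\mathcal{L} = \bar\psi(i\slashed{\partial} - m)\psi - \frac{1}{4}(F_{\mu\nu})^2 - e\bar\psi\gamma^\mu\psi A_\mu$ 

Dirac propagator: $\dfrac{i(\slashed{p} + m)}{p^2 - m^2 + i\epsilon}$ (A.4) 

Photon propagator: $\dfrac{-ig_{\mu\nu}}{p^2 + i\epsilon}$ (A.5) 

(Feynman gauge; see page 297 for generalized Lorentz gauge.) 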

QED vertex: $iQe\gamma^\mu$ (A.6) 

($Q = -1$ for an electron) 

External fermions: $u^s(p)$ (initial), $\bar{u}^s(p)$ (final) (A.7) 

External antifermions: $\bar{v}^s(p)$ (initial), $v^s(p)$ (final) (A.8) 

External photons: $\epsilon_\mu(p)$ (initial), $\epsilon^*_\mu(p)$ (final) (A.9) 

(Counterterm vertices for loop calculations are given on page 332.) 

Non-Abelian Gauge Theory: 

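$$\begin{aligned}
\mathcal{L} = {} & \bar\psi(i\slashed{\partial} - m)\psi - \tfrac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 + gA^a_\mu\bar\psi\gamma^\mu t^a\psi \\
& - gf^{abc}(\partial_\mu A^a_\nu)A^{\mu b}A^{\nu c} - \tfrac{1}{4}g^2(f^{eab}A^a_\mu A^b_\nu)(f^{ecd}A^{\mu c}A^{\nu d})
\end{aligned}$$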

The fermion and gauge boson propagators are the same as in QED, times 
an identity matrix in the gauge group space. Similarly, the polarization of 
external particles is treated the same as in QED, but each external particle 
also has an orientation in the group space. 

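Fermion vertex: $ig\gamma^\mu t^a$ (A.10) 

3-boson vertex: $gf^{abc}\bigl[g^{\mu\nu}(k-p)^\rho + g^{\nu\rho}(p-q)^\mu + g^{\rho\mu}(q-k)^\nu\bigr]$ (A.11) 

4-boson vertex: 
$$-ig^2\bigl[f^{abe}f^{cde}(g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}) + f^{ace}f^{bde}(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\sigma}g^{\nu\rho}) + f^{ade}f^{bce}(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma})\bigr] \tag{A.12}$$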

Ghost vertex: $-gf^{abc}p^\mu$ (A.13) 

Ghost propagator: $\dfrac{i\delta^{ab}}{p^2 + i\epsilon}$ (A.14) 

(Counterterm vertices for loop calculations are given on pages 528 and 532.) 


Other theories. Feynman rules for other theories can be found on the 
following pages: 

Yukawa theory          page 118 
Scalar QED             page 312 
Linear sigma model     page 353 
Electroweak theory     pages 716, 743, 753 


A.2 Polarizations of External Particles 

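The spinors $u^s(p)$ and $v^s(p)$ obey the Dirac equation in the form 

$$0 = (\slashed{p} - m)\,u^s(p) = \bar{u}^s(p)\,(\slashed{p} - m) = (\slashed{p} + m)\,v^s(p) = \bar{v}^s(p)\,(\slashed{p} + m), \tag{A.15}$$

where $\slashed{p} = \gamma^\mu p_\mu$. The Dirac matrices obey the anticommutation relations 

$$\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}. \tag{A.16}$$

We use a chiral basis, 

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}, \qquad \gamma^5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{A.17}$$

where 

$$\sigma^\mu = (1, \boldsymbol{\sigma}), \qquad \bar\sigma^\mu = (1, -\boldsymbol{\sigma}). \tag{A.18}$$

In this basis the normalized Dirac spinors can be written 

$$u^s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\xi^s \\ \sqrt{p\cdot\bar\sigma}\,\xi^s \end{pmatrix}, \qquad v^s(p) = \begin{pmatrix} \sqrt{p\cdot\sigma}\,\eta^s \\ -\sqrt{p\cdot\bar\sigma}\,\eta^s \end{pmatrix}, \tag{A.19}$$

where $\xi$ and $\eta$ are two-component spinors normalized to unity. In the 
high-energy limit these expressions simplify to 

$$u(p) \approx \sqrt{2E}\begin{pmatrix} \frac{1}{2}(1 - \hat{p}\cdot\boldsymbol{\sigma})\,\xi^s \\ \frac{1}{2}(1 + \hat{p}\cdot\boldsymbol{\sigma})\,\xi^s \end{pmatrix}, \qquad v(p) \approx \sqrt{2E}\begin{pmatrix} \frac{1}{2}(1 - \hat{p}\cdot\boldsymbol{\sigma})\,\eta^s \\ -\frac{1}{2}(1 + \hat{p}\cdot\boldsymbol{\sigma})\,\eta^s \end{pmatrix}. \tag{A.20}$$
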
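
The chiral-basis matrices of (A.17) and (A.18) can be checked numerically against the anticommutation relations (A.16). The following is a minimal sketch, assuming NumPy is available and assuming the conventions of the main text that are not restated in this Appendix: the metric $g^{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$ and $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$. The names `GAMMA`, `METRIC`, and `anticommutator_residual` are illustrative only.

```python
import numpy as np

# Pauli matrices in the standard basis; sigma^0 is the 2x2 identity.
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
SIGMA = [I2] + PAULI                     # sigma^mu    = (1,  sigma)
SIGMA_BAR = [I2] + [-s for s in PAULI]   # sigmabar^mu = (1, -sigma)

# gamma^mu = [[0, sigma^mu], [sigmabar^mu, 0]] in 2x2 block form, as in (A.17).
GAMMA = [np.block([[Z2, SIGMA[mu]], [SIGMA_BAR[mu], Z2]]) for mu in range(4)]
GAMMA5 = np.diag([-1, -1, 1, 1]).astype(complex)

# Assumed metric convention: g^{mu nu} = diag(+1, -1, -1, -1).
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def anticommutator_residual():
    """Largest entry of {gamma^mu, gamma^nu} - 2 g^{mu nu} over all mu, nu."""
    worst = 0.0
    for mu in range(4):
        for nu in range(4):
            lhs = GAMMA[mu] @ GAMMA[nu] + GAMMA[nu] @ GAMMA[mu]
            rhs = 2 * METRIC[mu, nu] * np.eye(4)
            worst = max(worst, np.abs(lhs - rhs).max())
    return worst
```

Under these assumptions `anticommutator_residual()` should vanish up to rounding, and `1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]` should reproduce `GAMMA5`.
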
Using the standard basis for the Pauli matrices, 

$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{A.21}$$

we have, for example, $\xi^s = \binom{1}{0}$ for spin up in the $z$ direction, and $\xi^s = \binom{0}{1}$ for 
spin down in the $z$ direction. For antifermions the physical spin is opposite to 
that of the spinor: $\eta^s = \binom{1}{0}$ corresponds to spin down in the $z$ direction, and 
so on. 

In computing unpolarized cross sections one encounters the polarization 
sums 


$$\sum_s u^s(p)\,\bar{u}^s(p) = \slashed{p} + m, \qquad \sum_s v^s(p)\,\bar{v}^s(p) = \slashed{p} - m. \tag{A.22}$$
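
The polarization sums (A.22) can be verified numerically from the explicit spinors (A.19). The sketch below assumes NumPy, a massive particle ($m > 0$, so that $p\cdot\sigma$ is positive definite), and the main-text definition $\bar{u} = u^\dagger\gamma^0$; the helper names `sqrt_hermitian` and `spin_sums` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA0 = np.block([[Z2, I2], [I2, Z2]])  # gamma^0 in the chiral basis


def sqrt_hermitian(mat):
    """Matrix square root of a positive-definite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.conj().T


def spin_sums(m, p3):
    """Return (sum_s u ubar, sum_s v vbar, pslash) for mass m > 0, 3-momentum p3."""
    p3 = np.asarray(p3, dtype=float)
    energy = np.sqrt(m * m + p3 @ p3)
    p_vec_sigma = sum(pi * s for pi, s in zip(p3, PAULI))
    p_sigma = energy * I2 - p_vec_sigma       # p . sigma
    p_sigma_bar = energy * I2 + p_vec_sigma   # p . sigmabar
    root, root_bar = sqrt_hermitian(p_sigma), sqrt_hermitian(p_sigma_bar)
    pslash = np.block([[Z2, p_sigma], [p_sigma_bar, Z2]])

    sum_u = np.zeros((4, 4), dtype=complex)
    sum_v = np.zeros((4, 4), dtype=complex)
    basis = (np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex))
    for two_spinor in basis:
        u = np.concatenate([root @ two_spinor, root_bar @ two_spinor])
        v = np.concatenate([root @ two_spinor, -(root_bar @ two_spinor)])
        # ubar = u^dagger gamma^0, so u ubar = outer(u, u*) @ gamma^0.
        sum_u += np.outer(u, u.conj()) @ GAMMA0
        sum_v += np.outer(v, v.conj()) @ GAMMA0
    return sum_u, sum_v, pslash
```

For example, `spin_sums(1.0, [0.3, -0.4, 1.2])` should return matrices equal to `pslash + 1` and `pslash - 1` (times the identity) up to rounding.
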


For polarized cross sections one can either resort to the explicit formulae 
(A.19) or insert the projection matrices 

$$\left(\frac{1 + \gamma^5}{2}\right), \qquad \left(\frac{1 - \gamma^5}{2}\right), \tag{A.23}$$

which project onto right- and left-handed spinors, respectively. Again, for 
antifermions, the helicity of the spinor is opposite to the physical helicity of 
the particle. 

Many other identities involving Dirac spinors and matrices can be found 
in Chapter 3. 

Polarization vectors for photons and other gauge bosons are conventionally 
normalized to unity. For massless bosons the polarization must be transverse: 

$$\epsilon^\mu = (0, \boldsymbol{\epsilon}), \quad \text{where } \mathbf{p}\cdot\boldsymbol{\epsilon} = 0. \tag{A.24}$$

If $\mathbf{p}$ is in the $+z$ direction, the polarization vectors are 

$$\epsilon^\mu = \frac{1}{\sqrt{2}}(0, 1, i, 0), \qquad \epsilon^\mu = \frac{1}{\sqrt{2}}(0, 1, -i, 0), \tag{A.25}$$

for right- and left-handed helicities, respectively. 

In computing unpolarized cross sections involving photons, one can replace 

$$\sum_{\text{polarizations}} \epsilon^{\mu *}\epsilon^{\nu} \to -g^{\mu\nu} \tag{A.26}$$

by virtue of the Ward identity. In the case of massless non-Abelian gauge 
bosons, one must also sum over the emission of ghosts, as discussed in 
Section 16.3. In the massive case, one must in addition include the emission of 
Goldstone bosons, as discussed in Section 21.1. 
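
A.3 Numerator Algebra 


Traces of $\gamma$ matrices can be evaluated as follows: 

$$\begin{aligned}
\mathrm{tr}(\mathbf{1}) &= 4 \\
\mathrm{tr}(\text{any odd \# of } \gamma\text{'s}) &= 0 \\
\mathrm{tr}(\gamma^\mu\gamma^\nu) &= 4g^{\mu\nu} \\
\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma) &= 4(g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}) \\
\mathrm{tr}(\gamma^5) &= 0 \\
\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^5) &= 0 \\
\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma^5) &= -4i\epsilon^{\mu\nu\rho\sigma}
\end{aligned} \tag{A.27}$$

Another identity allows one to reverse the order of $\gamma$ matrices inside a trace: 

$$\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\cdots) = \mathrm{tr}(\cdots\gamma^\sigma\gamma^\rho\gamma^\nu\gamma^\mu). \tag{A.28}$$
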

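Contractions of $\gamma$ matrices with each other simplify to: 

$$\begin{aligned}
\gamma^\mu\gamma_\mu &= 4 \\
\gamma^\mu\gamma^\nu\gamma_\mu &= -2\gamma^\nu \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu &= 4g^{\nu\rho} \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu &= -2\gamma^\sigma\gamma^\rho\gamma^\nu
\end{aligned} \tag{A.29}$$

(These identities apply in four dimensions only; see the following section.) 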
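
The four-dimensional trace and contraction identities (A.27) and (A.29) lend themselves to a brute-force numerical check. The sketch below assumes NumPy, the chiral basis of (A.17), and the metric $\mathrm{diag}(1,-1,-1,-1)$; `four_gamma_trace_residual` and `contract` are illustrative names, and the $\gamma^5$ traces are left out.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# gamma^mu with an upper index, in the chiral basis.
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
# Assumed metric; numerically g^{mu nu} and g_{mu nu} coincide.
G = np.diag([1.0, -1.0, -1.0, -1.0])


def four_gamma_trace_residual():
    """Largest deviation of tr(g^m g^n g^r g^s) from the fourth line of (A.27)."""
    worst = 0.0
    for m, n, r, s in itertools.product(range(4), repeat=4):
        lhs = np.trace(GAMMA[m] @ GAMMA[n] @ GAMMA[r] @ GAMMA[s])
        rhs = 4 * (G[m, n] * G[r, s] - G[m, r] * G[n, s] + G[m, s] * G[n, r])
        worst = max(worst, abs(lhs - rhs))
    return worst


def contract(middle):
    """Return gamma^mu @ middle @ gamma_mu, summed over mu."""
    # gamma_mu = g_{mu mu} gamma^mu because the metric is diagonal.
    return sum(G[mu, mu] * GAMMA[mu] @ middle @ GAMMA[mu] for mu in range(4))
```

For instance, `contract(GAMMA[2])` should equal `-2 * GAMMA[2]`, and `contract(GAMMA[0] @ GAMMA[1] @ GAMMA[2])` should equal `-2 * GAMMA[2] @ GAMMA[1] @ GAMMA[0]`.
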
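Contractions of the $\epsilon$ symbol can also be simplified: 

$$\begin{aligned}
\epsilon^{\alpha\beta\gamma\delta}\epsilon_{\alpha\beta\gamma\delta} &= -24 \\
\epsilon^{\alpha\beta\gamma\mu}\epsilon_{\alpha\beta\gamma\nu} &= -6\,\delta^\mu_\nu \\
\epsilon^{\alpha\beta\mu\nu}\epsilon_{\alpha\beta\rho\sigma} &= -2(\delta^\mu_\rho\delta^\nu_\sigma - \delta^\mu_\sigma\delta^\nu_\rho)
\end{aligned} \tag{A.30}$$
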

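In some calculations, it is useful to rearrange products of fermion bilinears 
by means of Fierz identities. Let $u_1, \ldots, u_4$ be Dirac spinors, and let 
$u_{iL} = \frac{1}{2}(1 - \gamma^5)u_i$ be the left-handed projection. Then the most important Fierz 
rearrangement formula is 

$$(\bar{u}_{1L}\gamma^\mu u_{2L})(\bar{u}_{3L}\gamma_\mu u_{4L}) = -(\bar{u}_{1L}\gamma^\mu u_{4L})(\bar{u}_{3L}\gamma_\mu u_{2L}). \tag{A.31}$$

Additional formulae can be generated by the use of the following identities 
for the $2\times 2$ blocks of Dirac matrices: 

$$(\sigma^\mu)_{\alpha\beta}(\sigma_\mu)_{\gamma\delta} = 2\epsilon_{\alpha\gamma}\epsilon_{\beta\delta}, \qquad (\bar\sigma^\mu)_{\alpha\beta}(\bar\sigma_\mu)_{\gamma\delta} = 2\epsilon_{\alpha\gamma}\epsilon_{\beta\delta}. \tag{A.32}$$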
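
The two-component identities (A.32) involve only $2\times 2$ matrices, so they can be confirmed by direct enumeration of all index values. This is a small sketch assuming NumPy and the metric $\mathrm{diag}(1,-1,-1,-1)$ for lowering the index on $\sigma_\mu$; the function names are hypothetical.

```python
import numpy as np

# sigma^mu = (1, sigma) with an upper Lorentz index, shape (4, 2, 2).
SIGMA_UP = np.array([
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
METRIC_DIAG = np.array([1.0, -1.0, -1.0, -1.0])   # assumed metric signature
EPS = np.array([[0.0, 1.0], [-1.0, 0.0]])         # antisymmetric 2x2 symbol


def sigma_contraction(bar=False):
    """(sigma^mu)_{ab} (sigma_mu)_{cd}, or the sigmabar version if bar=True."""
    up = SIGMA_UP.copy()
    if bar:
        up[1:] *= -1                         # sigmabar^mu = (1, -sigma)
    down = METRIC_DIAG[:, None, None] * up   # lower the index with the metric
    return np.einsum("mab,mcd->abcd", up, down)


def epsilon_product():
    """2 eps_{ac} eps_{bd} as a rank-4 array indexed [a, b, c, d]."""
    return 2 * np.einsum("ac,bd->abcd", EPS, EPS)
```

Both `sigma_contraction()` and `sigma_contraction(bar=True)` should agree with `epsilon_product()`; the overall sign convention chosen for `EPS` drops out because the symbol appears twice.
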

In non-Abelian gauge theories, the Feynman rules involve gauge group 
matrices $t^a$ that form a representation $r$ of a Lie algebra $G$. The symbol $G$ 
also denotes the adjoint representation of the algebra. The matrices $t^a$ obey 

$$[t^a, t^b] = if^{abc}t^c, \tag{A.33}$$
where the structure constants $f^{abc}$ are totally antisymmetric. The invariants 
$C(r)$ and $C_2(r)$ of the representation $r$ are defined by 

$$\mathrm{tr}[t^a t^b] = C(r)\,\delta^{ab}, \qquad t^a t^a = C_2(r)\cdot\mathbf{1}. \tag{A.34}$$

These are related by 

$$C(r) = \frac{d(r)}{d(G)}\,C_2(r), \tag{A.35}$$

where $d(r)$ is the dimension of the representation. Traces and contractions of 
the $t^a$ can be evaluated using the above identities and their consequences: 

$$\begin{aligned}
t^a t^b t^a &= \bigl[C_2(r) - \tfrac{1}{2}C_2(G)\bigr]\,t^b \\
f^{acd}f^{bcd} &= C_2(G)\,\delta^{ab} \\
f^{abc}t^b t^c &= \tfrac{1}{2}\,i\,C_2(G)\,t^a
\end{aligned} \tag{A.36}$$

For $SU(N)$ groups, the fundamental representation is denoted by $N$, and 
we have 

$$C(N) = \frac{1}{2}, \qquad C_2(N) = \frac{N^2 - 1}{2N}, \qquad C(G) = C_2(G) = N. \tag{A.37}$$

The following relation, satisfied by the matrices of the fundamental 
representation of $SU(N)$, is also very helpful: 

$$(t^a)_{ij}(t^a)_{k\ell} = \frac{1}{2}\Bigl(\delta_{i\ell}\delta_{kj} - \frac{1}{N}\,\delta_{ij}\delta_{k\ell}\Bigr). \tag{A.38}$$
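
For the simplest case $N = 2$ the group-theory relations (A.34), (A.37), and (A.38) can be checked explicitly. The sketch below assumes NumPy and takes the fundamental generators to be $t^a = \sigma^a/2$, a standard choice that this Appendix does not itself spell out; all function names are illustrative.

```python
import numpy as np

N = 2
PAULI = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
T = PAULI / 2   # assumed SU(2) generators in the fundamental representation


def trace_normalization():
    """Matrix of tr[t^a t^b]; expected to be C(N) delta^{ab} with C(N) = 1/2."""
    return np.einsum("aij,bji->ab", T, T)


def quadratic_casimir():
    """Sum over a of t^a t^a; expected to be C_2(N) times the identity."""
    return np.einsum("aij,ajk->ik", T, T)


def completeness_residual():
    """Largest deviation from (A.38) over all index values i, j, k, l."""
    lhs = np.einsum("aij,akl->ijkl", T, T)
    d = np.eye(N)
    rhs = 0.5 * (np.einsum("il,kj->ijkl", d, d)
                 - np.einsum("ij,kl->ijkl", d, d) / N)
    return np.abs(lhs - rhs).max()
```

With $N = 2$ one expects $C(N) = 1/2$ and $C_2(N) = (N^2-1)/(2N) = 3/4$, consistent with (A.35) since $d(N)/d(G) = 2/3$.
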


A.4 Loop Integrals and Dimensional Regularization 


To combine propagator denominators, introduce integrals over Feynman 
parameters: 

$$\frac{1}{A_1 A_2 \cdots A_n} = \int_0^1 dx_1 \cdots dx_n\; \delta\Bigl(\sum x_i - 1\Bigr)\, \frac{(n-1)!}{[x_1 A_1 + x_2 A_2 + \cdots + x_n A_n]^n}. \tag{A.39}$$

In the case of only two denominator factors, this formula reduces to 

$$\frac{1}{AB} = \int_0^1 dx\, \frac{1}{[xA + (1-x)B]^2}. \tag{A.40}$$

A more general formula in which the $A_i$ are raised to arbitrary powers is given 
in Eq. (6.42). 
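
The two-denominator formula (A.40) is easy to test numerically. The following sketch uses only the Python standard library and composite Simpson's rule (a choice made here for illustration); it assumes $A$ and $B$ are real with the same sign, so that the integrand has no singularity on $[0, 1]$. The function name is hypothetical.

```python
def feynman_two_denominators(a, b, steps=2000):
    """Approximate the integral of dx / [x a + (1 - x) b]^2 over 0 <= x <= 1.

    Uses composite Simpson's rule; a and b must have the same sign.
    """
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even count
    h = 1.0 / steps

    def integrand(x):
        return 1.0 / (x * a + (1.0 - x) * b) ** 2

    total = integrand(0.0) + integrand(1.0)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2      # alternating Simpson weights
        total += weight * integrand(k * h)
    return total * h / 3.0
```

For example, `feynman_two_denominators(2.0, 5.0)` should agree with $1/(AB) = 0.1$ to high accuracy.
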

Once this is done, the bracketed quantity in the denominator will be a 
quadratic function of the integration variables $p_i^\mu$. Next, complete the square 
and shift the integration variables to absorb the terms linear in $p_i^\mu$. For a 
one-loop integral, there is a single integration momentum $p^\mu$, which is shifted 
to a momentum variable $\ell^\mu$. After this shift, the denominator takes the form 
$(\ell^2 - \Delta)^n$. In the numerator, terms with an odd number of powers of $\ell$ vanish 
by symmetric integration. Symmetry also allows one to replace 

$$\ell^\mu\ell^\nu \to \frac{1}{d}\,\ell^2 g^{\mu\nu}, \tag{A.41}$$

$$\ell^\mu\ell^\nu\ell^\rho\ell^\sigma \to \frac{1}{d(d+2)}\,(\ell^2)^2\,(g^{\mu\nu}g^{\rho\sigma} + g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}). \tag{A.42}$$

(Here $d$ is the spacetime dimension.) The integral is most conveniently 
evaluated after Wick-rotating to Euclidean space, with the substitution 

$$\ell^0 = i\ell^0_E, \qquad \ell^2 = -\ell_E^2. \tag{A.43}$$


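Alternatively, one can use the following table of $d$-dimensional integrals in 
Minkowski space: 

$$\int \frac{d^d\ell}{(2\pi)^d}\, \frac{1}{(\ell^2 - \Delta)^n} = \frac{(-1)^n i}{(4\pi)^{d/2}}\, \frac{\Gamma(n - \frac{d}{2})}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n - \frac{d}{2}} \tag{A.44}$$

$$\int \frac{d^d\ell}{(2\pi)^d}\, \frac{\ell^2}{(\ell^2 - \Delta)^n} = \frac{(-1)^{n-1} i}{(4\pi)^{d/2}}\, \frac{d}{2}\, \frac{\Gamma(n - \frac{d}{2} - 1)}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n - \frac{d}{2} - 1} \tag{A.45}$$

$$\int \frac{d^d\ell}{(2\pi)^d}\, \frac{\ell^\mu\ell^\nu}{(\ell^2 - \Delta)^n} = \frac{(-1)^{n-1} i}{(4\pi)^{d/2}}\, \frac{g^{\mu\nu}}{2}\, \frac{\Gamma(n - \frac{d}{2} - 1)}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n - \frac{d}{2} - 1} \tag{A.46}$$

$$\int \frac{d^d\ell}{(2\pi)^d}\, \frac{(\ell^2)^2}{(\ell^2 - \Delta)^n} = \frac{(-1)^n i}{(4\pi)^{d/2}}\, \frac{d(d+2)}{4}\, \frac{\Gamma(n - \frac{d}{2} - 2)}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n - \frac{d}{2} - 2} \tag{A.47}$$

$$\int \frac{d^d\ell}{(2\pi)^d}\, \frac{\ell^\mu\ell^\nu\ell^\rho\ell^\sigma}{(\ell^2 - \Delta)^n} = \frac{(-1)^n i}{(4\pi)^{d/2}}\, \frac{\Gamma(n - \frac{d}{2} - 2)}{\Gamma(n)} \left(\frac{1}{\Delta}\right)^{n - \frac{d}{2} - 2} \times \frac{1}{4}\,(g^{\mu\nu}g^{\rho\sigma} + g^{\mu\rho}g^{\nu\sigma} + g^{\mu\sigma}g^{\nu\rho}) \tag{A.48}$$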
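
For a convergent case the first entry of this table can be compared with a direct Euclidean evaluation. The sketch below uses only the standard library. It assumes integer $d$ with $2n > d$ and $\Delta > 0$, uses the Wick rotation (A.43) in the form $d^d\ell = i\,d^d\ell_E$, and takes the solid angle in $d$ dimensions to be $2\pi^{d/2}/\Gamma(d/2)$. The function names and the midpoint-rule quadrature are choices made here for illustration.

```python
import math


def minkowski_coefficient(n, d, delta):
    """Real number c such that the integral in (A.44) equals i * c."""
    return ((-1) ** n / (4 * math.pi) ** (d / 2)
            * math.gamma(n - d / 2) / math.gamma(n)
            * delta ** (d / 2 - n))


def euclidean_coefficient(n, d, delta, steps=20_000):
    """The same coefficient from the Wick-rotated radial integral.

    After rotation the integral is i (-1)^n times the Euclidean integral of
    (l_E^2 + delta)^(-n), evaluated here with the midpoint rule.
    """
    solid_angle = 2 * math.pi ** (d / 2) / math.gamma(d / 2)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h             # midpoint of the k-th cell in (0, 1)
        ell = t / (1.0 - t)           # maps (0, 1) onto (0, infinity)
        jacobian = 1.0 / (1.0 - t) ** 2
        total += ell ** (d - 1) / (ell * ell + delta) ** n * jacobian
    return (-1) ** n * solid_angle / (2 * math.pi) ** d * total * h
```

For $n = 3$, $d = 4$ both routes should give $-1/(32\pi^2\Delta)$, so `minkowski_coefficient(3, 4, 2.0)` and `euclidean_coefficient(3, 4, 2.0)` should agree closely.
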


If the integral converges, one can set $d = 4$ from the start. If the integral 
diverges, the behavior near $d = 4$ can be extracted by expanding 

$$\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}} = 1 - \Bigl(2 - \frac{d}{2}\Bigr)\log\Delta + \cdots. \tag{A.49}$$

One also needs the expansion of $\Gamma(x)$ near its poles: 

$$\Gamma(x) = \frac{1}{x} - \gamma + O(x) \tag{A.50}$$

near $x = 0$, and 

$$\Gamma(x) = \frac{(-1)^n}{n!}\left(\frac{1}{x + n} - \gamma + 1 + \cdots + \frac{1}{n} + O(x + n)\right) \tag{A.51}$$

near $x = -n$. Here $\gamma$ is the Euler-Mascheroni constant, $\gamma \approx 0.5772$. The 
following combination of terms often appears in calculations: 

$$\frac{\Gamma(2 - \frac{d}{2})}{(4\pi)^{d/2}}\left(\frac{1}{\Delta}\right)^{2 - \frac{d}{2}} = \frac{1}{(4\pi)^2}\left(\frac{2}{\epsilon} - \log\Delta - \gamma + \log(4\pi) + O(\epsilon)\right), \tag{A.52}$$

with $\epsilon = 4 - d$. 
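
The expansion (A.52) can be compared with the exact left-hand side at small $\epsilon$. This is a minimal standard-library sketch assuming real $\Delta > 0$; the function names and the hard-coded value of the Euler-Mascheroni constant are supplied here for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler-Mascheroni constant


def exact_combination(eps, delta):
    """Left-hand side of (A.52) evaluated exactly at d = 4 - eps."""
    x = eps / 2                      # this is 2 - d/2
    d_half = 2 - x                   # this is d/2
    return math.gamma(x) / (4 * math.pi) ** d_half * delta ** (-x)


def expanded_combination(eps, delta):
    """Right-hand side of (A.52) with the O(eps) remainder dropped."""
    bracket = (2 / eps - math.log(delta) - EULER_GAMMA
               + math.log(4 * math.pi))
    return bracket / (4 * math.pi) ** 2
```

The difference between the two functions should shrink roughly linearly as `eps` is reduced, reflecting the $O(\epsilon)$ remainder.
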

Notice that $\Delta$ is positive if it is a combination of masses and spacelike 
momentum invariants. If $\Delta$ contains timelike momenta, it may become negative. 
Then these integrals acquire imaginary parts, which give the discontinuities 
of $S$-matrix elements. To compute the $S$-matrix in a physical region, choose 
the correct branch of the function by the prescription 

$$\left(\frac{1}{\Delta}\right)^{n - \frac{d}{2}} \to \left(\frac{1}{\Delta - i\epsilon}\right)^{n - \frac{d}{2}}, \tag{A.53}$$

where $-i\epsilon$ (not to be confused with $\epsilon$ in the previous paragraph!) gives a tiny 
negative imaginary part. 

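Traces in Eq. (A.27) that do not involve $\gamma^5$ are independent of 
dimensionality. However, since 

$$g^{\mu\nu}g_{\mu\nu} = \delta^\mu_\mu = d \tag{A.54}$$

in $d$ dimensions, the contraction identities (A.29) are modified: 

$$\begin{aligned}
\gamma^\mu\gamma_\mu &= d \\
\gamma^\mu\gamma^\nu\gamma_\mu &= -(d - 2)\,\gamma^\nu \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma_\mu &= 4g^{\nu\rho} - (4 - d)\,\gamma^\nu\gamma^\rho \\
\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma\gamma_\mu &= -2\gamma^\sigma\gamma^\rho\gamma^\nu + (4 - d)\,\gamma^\nu\gamma^\rho\gamma^\sigma
\end{aligned} \tag{A.55}$$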
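
As a short worked check, the second line of (A.55) follows from the anticommutation relation (A.16) together with the first line, $\gamma^\mu\gamma_\mu = d$. Anticommuting $\gamma^\mu$ past $\gamma^\nu$ gives:

```latex
\gamma^\mu\gamma^\nu\gamma_\mu
  = (2g^{\mu\nu} - \gamma^\nu\gamma^\mu)\,\gamma_\mu
  = 2\gamma^\nu - d\,\gamma^\nu
  = -(d-2)\,\gamma^\nu .
```

Setting $d = 4$ recovers the value $-2\gamma^\nu$ quoted in (A.29).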


A.5 Cross Sections and Decay Rates 


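Once the squared matrix element for a scattering process is known, the 
differential cross section is given by 

$$d\sigma = \frac{1}{2E_A\, 2E_B\, |v_A - v_B|} \left(\prod_f \frac{d^3p_f}{(2\pi)^3}\, \frac{1}{2E_f}\right) \times \bigl|\mathcal{M}(p_A, p_B \to \{p_f\})\bigr|^2\, (2\pi)^4\, \delta^{(4)}\Bigl(p_A + p_B - \sum p_f\Bigr). \tag{A.56}$$

The differential decay rate of an unstable particle to a given final state is 

$$d\Gamma = \frac{1}{2m_A} \left(\prod_f \frac{d^3p_f}{(2\pi)^3}\, \frac{1}{2E_f}\right) \bigl|\mathcal{M}(m_A \to \{p_f\})\bigr|^2\, (2\pi)^4\, \delta^{(4)}\Bigl(p_A - \sum p_f\Bigr). \tag{A.57}$$

For the special case of a two-particle final state, the Lorentz-invariant phase 
space takes the simple form 

$$\left(\prod_f \int \frac{d^3p_f}{(2\pi)^3}\, \frac{1}{2E_f}\right) (2\pi)^4\, \delta^{(4)}\Bigl(\sum p_i - \sum p_f\Bigr) = \int \frac{d\Omega_{cm}}{4\pi}\, \frac{1}{8\pi} \left(\frac{2|\mathbf{p}|}{E_{cm}}\right), \tag{A.58}$$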
where $|\mathbf{p}|$ is the magnitude of the 3-momentum of either particle in the 
center-of-mass frame. 
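
To use (A.58) in practice one needs $|\mathbf{p}|$ in terms of $E_{cm}$ and the final-state masses. The sketch below relies on the standard two-body kinematic relation $|\mathbf{p}| = \sqrt{(E_{cm}^2 - (m_1+m_2)^2)(E_{cm}^2 - (m_1-m_2)^2)}\,/\,(2E_{cm})$, which is not stated in this Appendix. It also assumes a matrix element independent of angle, so that the $d\Omega_{cm}/4\pi$ integral gives 1. All function names are illustrative.

```python
import math


def cm_momentum(e_cm, m1, m2):
    """Magnitude of either particle's 3-momentum in the center-of-mass frame."""
    if e_cm < m1 + m2:
        raise ValueError("e_cm is below the two-particle threshold")
    numerator = (e_cm ** 2 - (m1 + m2) ** 2) * (e_cm ** 2 - (m1 - m2) ** 2)
    return math.sqrt(numerator) / (2 * e_cm)


def two_body_phase_space(e_cm, m1, m2):
    """Angle-integrated two-body phase space, (1 / 8 pi) (2 |p| / E_cm)."""
    return (1 / (8 * math.pi)) * (2 * cm_momentum(e_cm, m1, m2) / e_cm)


def isotropic_decay_rate(matrix_element_sq, m_parent, m1, m2):
    """Decay rate from (A.57) and (A.58) for an angle-independent |M|^2."""
    return (matrix_element_sq * two_body_phase_space(m_parent, m1, m2)
            / (2 * m_parent))
```

For massless final-state particles the phase-space factor reduces to $1/(8\pi)$, since then $|\mathbf{p}| = E_{cm}/2$.
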


A.6 Physical Constants and Conversion Factors 


Precisely known physical constants: 

$c = 2.998 \times 10^{10}$ cm/s 
$\hbar = 6.582 \times 10^{-22}$ MeV s 
$e = -1.602 \times 10^{-19}$ C 
$\alpha = \dfrac{e^2}{4\pi\hbar c} = \dfrac{1}{137.04} = 0.00730$ 
$G_F = 1.166 \times 10^{-5}$ GeV$^{-2}$ 


The values of the strong and weak interaction coupling constants depend on 
the conventions used to define them, as explained in Sections 17.6 and 21.3. 
For the purpose of estimation, however, one can use the following values: 


$\alpha_s(10\ \text{GeV}) = 0.18$ 
$\alpha_s(m_Z) = 0.12$ 
$\sin^2\theta_w = 0.23$ 

Particle masses (times $c^2$): 

$e$     : 0.5110 MeV        $p$       : 938.3 MeV 
$\mu$   : 105.6 MeV         $n$       : 939.6 MeV 
$\tau$  : 1777 MeV          $\pi^\pm$ : 139.6 MeV 
$W^\pm$ : 80.2 GeV          $\pi^0$   : 135.0 MeV 
$Z^0$   : 91.19 GeV 

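Useful combinations: 

Bohr radius: $a_0 = \dfrac{\hbar}{\alpha m_e c} = 5.292 \times 10^{-9}$ cm 

electron Compton wavelength: $\bar\lambda = \dfrac{\hbar}{m_e c} = 3.862 \times 10^{-11}$ cm 

classical electron radius: $r_e = \dfrac{\alpha\hbar}{m_e c} = 2.818 \times 10^{-13}$ cm 

Thomson cross section: $\sigma_T = \dfrac{8\pi r_e^2}{3} = 0.6652$ barn 

annihilation cross section: $1\ R = \dfrac{4\pi\alpha^2}{3E_{cm}^2} = \dfrac{86.8\ \text{nbarn}}{(E_{cm}\ \text{in GeV})^2}$ 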

Conversion factors: 

$(1\ \text{GeV})/c^2 = 1.783 \times 10^{-24}$ g 
$(1\ \text{GeV})^{-1}(\hbar c) = 0.1973 \times 10^{-13}$ cm $= 0.1973$ fm 
$(1\ \text{GeV})^{-2}(\hbar c)^2 = 0.3894 \times 10^{-27}$ cm$^2$ $= 0.3894$ mbarn 
$1\ \text{barn} = 10^{-24}$ cm$^2$ 
$(1\ \text{volt/meter})(e\hbar c) = 1.973 \times 10^{-25}$ GeV$^2$ 
$(1\ \text{tesla})(e\hbar c^2) = 5.916 \times 10^{-17}$ GeV$^2$ 
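
The "useful combinations" listed earlier in this section can be reproduced from the tabulated constants and conversion factors, which is a convenient sanity check on units. The following standard-library sketch takes its inputs from the rounded four-digit values quoted above, so agreement is expected only to roughly that precision. The constant and function names are illustrative.

```python
import math

ALPHA = 1 / 137.04            # fine-structure constant, as tabulated
HBARC_GEV_CM = 0.1973e-13     # (1 GeV)^-1 (hbar c) in cm
M_E_GEV = 0.5110e-3           # electron mass times c^2, in GeV
GEV2_TO_MBARN = 0.3894        # (1 GeV)^-2 (hbar c)^2 in mbarn
BARN_CM2 = 1e-24              # 1 barn in cm^2


def compton_wavelength_cm():
    """hbar / (m_e c) in cm."""
    return HBARC_GEV_CM / M_E_GEV


def bohr_radius_cm():
    """hbar / (alpha m_e c) in cm."""
    return compton_wavelength_cm() / ALPHA


def classical_electron_radius_cm():
    """alpha hbar / (m_e c) in cm."""
    return ALPHA * compton_wavelength_cm()


def thomson_cross_section_barn():
    """8 pi r_e^2 / 3 in barn."""
    return 8 * math.pi * classical_electron_radius_cm() ** 2 / 3 / BARN_CM2


def annihilation_unit_nbarn(e_cm_gev):
    """4 pi alpha^2 / (3 E_cm^2) in nbarn, for E_cm given in GeV."""
    natural = 4 * math.pi * ALPHA ** 2 / (3 * e_cm_gev ** 2)   # in GeV^-2
    return natural * GEV2_TO_MBARN * 1e6                        # mbarn -> nbarn
```

For instance, `annihilation_unit_nbarn(1.0)` should come out close to the quoted 86.8 nbarn, and `bohr_radius_cm()` close to $5.292 \times 10^{-9}$ cm.
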


A complete, up-to-date tabulation of the fundamental constants and the 
properties of elementary particles is given in the Review of Particle Properties, 
which can be found in a recent issue of either Physical Review D or Physics 
Letters B. The most recent Review as of this writing is published in Physical 
Review D50, 1173 (1994). 